shreyamalogi/README.md

👋 Hello! I'm SHREYA MALOGI

MSc Data Analytics @ BSBI Berlin | Data Scientist | ML Engineer
Architecting Scalable Forecasting Engines & Industrial Computer Vision


🚀 Career Objective & Immediate Availability

Actively Interviewing: Seeking Full-Time Data Science / ML Engineer roles.

Location: Hyderabad, India (Available for Local & GCC Roles) | Globally Mobile.

Availability: Immediate Start.


📊 Engineering Highlights

  • Big Data Architect: Engineered a 15.2M-record pipeline using PySpark and GCP Dataproc.
  • Memory Optimization: Achieved a 70% RAM reduction via advanced data downcasting and feature engineering.
  • Computer Vision: Developed an industrial-grade ReUNet model achieving 91.7% accuracy.
  • Technical Leadership: Former Founder & Lead Developer at CodeMacrocosm; scaled an open-source community to 2,100+ members and 1.2k+ Stars.
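
The downcasting idea behind the memory-optimization highlight can be illustrated with a dependency-free sketch. It is not the pipeline's actual code: the helper `downcast_ints` and the sample `units_sold` data are hypothetical, and the standard-library `array` module stands in for a DataFrame column. In pandas, a comparable effect is usually obtained with `pd.to_numeric(column, downcast="integer")`.

```python
from array import array

# Signed integer typecodes in the stdlib `array` module, narrowest first.
_INT_CODES = ("b", "h", "i", "q")


def downcast_ints(values):
    """Pack integers into the narrowest signed array type that holds them all.

    Hypothetical helper for illustration; expects a non-empty sequence.
    """
    lo, hi = min(values), max(values)
    for code in _INT_CODES:
        bits = array(code).itemsize * 8
        if -(1 << (bits - 1)) <= lo and hi <= (1 << (bits - 1)) - 1:
            return array(code, values)
    raise OverflowError("values do not fit in a 64-bit signed integer")


# Made-up daily unit sales: small counts that do not need 64 bits each.
units_sold = [0, 3, 12, 0, 7, 120, 0, 45]

wide = array("q", units_sold)        # 8 bytes per value
narrow = downcast_ints(units_sold)   # narrowest type that fits the range

# Fraction of buffer memory saved by the narrower representation.
saving = 1 - (narrow.itemsize * len(narrow)) / (wide.itemsize * len(wide))
print(f"typecode={narrow.typecode}, saving={saving:.1%}")
```

The saving depends entirely on the value range of each column, so the figure printed for this toy data says nothing about the 70% reported above.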

Technical Toolbox (Ranked by Industry Demand)

| Category | Tools & Technologies |
| --- | --- |
| Machine Learning | Python, Scikit-Learn, LightGBM, XGBoost, TensorFlow, OpenCV |
| Data Engineering | PySpark, GCP (Dataproc/BigQuery), SQL (PostgreSQL), ETL Pipelines |
| Statistical Research | Predictive Modeling, Time-Series Forecasting, Sales Analytics |
| Software Ops | Git/GitHub, DSA (O(n) Optimization), Docker, Flask API |

🏆 Featured Core Architectures

Scale: 15.2 Million Transactions | Tech: PySpark, LightGBM, GCP

  • The Problem: High latency and out-of-memory crashes during large-scale retail demand forecasting.
  • The Solution: Implemented a Tweedie-loss LightGBM model with a memory-optimized data loader.
  • The Result: 70% less memory usage and 15% higher accuracy than baseline models.

  • The Description: A Strategic HR Intelligence Workspace leveraging Behavioral Analytics to predict employee attrition. Features a flight-risk engine analyzing Retention Rates, Income-to-Attrition correlation, and patterns.
  • Tech Stack: Python, Advanced SQL, Data Modeling

  • The Description: A high-precision Computer Vision workspace featuring ReUNet for medical segmentation and MobileNetV2 for AgTech classification. Demonstrates cross-domain AI expertise in Healthcare Diagnostics.
  • Tech Stack: PyTorch, Jupyter Notebook, OpenCV
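
To show why a Tweedie loss suits demand data with many zero-sale days, here is a small sketch of the Tweedie unit deviance for a variance power 1 < p < 2. The README does not state the variance power used in the forecasting project, so `p = 1.5` is an assumption, and `tweedie_deviance` is a hypothetical helper rather than the project's code. In LightGBM the corresponding settings are `objective="tweedie"` and `tweedie_variance_power`.

```python
def tweedie_deviance(y: float, mu: float, p: float = 1.5) -> float:
    """Tweedie unit deviance for an observation y and a prediction mu.

    Valid for 1 < p < 2 (compound Poisson-gamma), y >= 0 and mu > 0.
    It is finite at y == 0, which is why it handles zero-inflated targets.
    """
    if not 1 < p < 2:
        raise ValueError("this sketch only covers 1 < p < 2")
    if y < 0 or mu <= 0:
        raise ValueError("requires y >= 0 and mu > 0")
    return 2 * (
        y ** (2 - p) / ((1 - p) * (2 - p))
        - y * mu ** (1 - p) / (1 - p)
        + mu ** (2 - p) / (2 - p)
    )


# A zero-sales day: the penalty grows with the predicted demand.
for predicted in (0.5, 1.0, 2.0):
    print(predicted, round(tweedie_deviance(0.0, predicted), 3))

# A perfect prediction has zero deviance (up to floating-point error).
print(tweedie_deviance(3.0, 3.0))
```

Unlike a log-transformed squared error, this loss needs no special handling for exact zeros, while still treating positive sales as a skewed continuous quantity.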

More Production & Data Engineering Repositories


Business Intelligence & Big Data Pipelines

  • Retail-Sales-Intelligence-Dashboard
    • Impact: Architected a multi-branch retail performance dashboard tracking Total Sales, Gross Income, and Tax (5%) across diverse product categories and cities.
  • Amazon-BigData-Verified-Review-Classifier
    • Impact: Scalable Trust-Signal Detection: A Big Data pipeline using PySpark and GCP Dataproc to classify 8GB+ of Amazon reviews with high-precision Random Forest modeling. Engineered for horizontal scalability using Hive bucket partitioning.
  • Retail-Data-Engineering-Pipeline
    • Impact: Scalable ETL Pipeline: Processing 5M+ retail records with PySpark on GCP Dataproc. Automated the extraction of global business KPIs and consumer trends. Includes an Ethical Data Framework.
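
The bucket partitioning mentioned for the Amazon review classifier rests on a simple idea: hash a key and take it modulo a fixed number of buckets, so rows with the same key always land together. The sketch below illustrates only that idea. The function names, the `product_id` field, and the use of CRC32 are assumptions made for the example; Hive applies its own hash function and file layout.

```python
import zlib
from collections import defaultdict


def bucket_for(key: str, num_buckets: int) -> int:
    """Map a key to a stable bucket id in [0, num_buckets)."""
    # CRC32 is deterministic across runs, unlike Python's built-in hash() for str.
    return zlib.crc32(key.encode("utf-8")) % num_buckets


def bucketize(rows, key_field: str, num_buckets: int):
    """Group rows by the bucket of their key so each bucket can be processed alone."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[bucket_for(row[key_field], num_buckets)].append(row)
    return buckets


# Made-up review rows; only the shape matters here.
reviews = [
    {"product_id": "B001", "verified": True},
    {"product_id": "B002", "verified": False},
    {"product_id": "B001", "verified": True},
    {"product_id": "B003", "verified": True},
]

buckets = bucketize(reviews, "product_id", num_buckets=4)
for bucket_id, rows in sorted(buckets.items()):
    print(bucket_id, [r["product_id"] for r in rows])
```

Because the bucket of a key never changes, joins and aggregations on that key can run bucket by bucket instead of shuffling the whole dataset.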

Interactive Systems & Machine Learning Repos

  • Biometric-Attendance-Engine
    • Impact: Real-time face recognition system using HOG encodings and Dlib landmarks. Features a high-speed Flask/OpenCV pipeline for live video processing and automated SQL database logging.
  • Intelligent-Travel-Recommendation-Engine
    • Impact: An Intelligent Travel Recommendation Engine using TF-IDF Vectorization and KNN to predict optimal tourist destinations. Features a modular Python/Tkinter architecture.
  • Mushroom-Classification-Predictive-Analytics
    • Impact: High-precision predictive classification achieving 100% accuracy using Random Forest & XGBoost. Optimized via GridSearchCV to ensure zero-false-negative outcomes in safety-critical settings.
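
The TF-IDF plus nearest-neighbour approach named for the travel recommendation engine can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the repository's implementation: the destination descriptions are invented, the function names are hypothetical, and the smoothed IDF formula is one common variant.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict) per document, plus the IDF table."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(docs)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF: rarer terms get larger weights.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in doc_freq.items()}
    vectors = [
        {term: count * idf[term] for term, count in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(query, names, docs, k=2):
    """Return the k destinations whose descriptions are closest to the query."""
    vectors, idf = tfidf_vectors(docs)
    query_counts = Counter(query.lower().split())
    # Query terms outside the vocabulary carry no weight.
    query_vec = {t: c * idf[t] for t, c in query_counts.items() if t in idf}
    scored = [(name, cosine(query_vec, vec)) for name, vec in zip(names, vectors)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Invented destination descriptions for the example.
names = ["Goa", "Manali", "Jaipur"]
docs = [
    "beach sea nightlife seafood",
    "mountains snow trekking",
    "forts palaces desert heritage",
]

print(recommend("snow trekking", names, docs, k=1))
```

Ranking by cosine similarity and keeping the top k is the nearest-neighbour step; a library such as scikit-learn would typically replace both helpers in a real project.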

Comprehensive Portfolio & Creative Works

  • The Description: A centralized showcase highlighting my versatility across Software Development, Tech Leadership/Direction, and Creative Engineering projects. It acts as a curated window into my most impactful and multifaceted work.

DSA & Coding Philosophy

I write Production-Grade Python. I focus on $O(n)$ time complexity and memory-efficient data structures to ensure ML models scale seamlessly from 1k to 15M+ records.
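
As one concrete reading of that philosophy, the sketch below computes the mean and variance of a stream in a single pass with constant memory, using Welford's update. The function name and the sample values are illustrative assumptions, not code taken from the repositories above.

```python
def running_stats(stream):
    """Single-pass count, mean, and population variance of a numeric stream.

    O(n) time and O(1) extra memory: the stream is never materialised,
    so it works for a generator over millions of records.
    """
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in stream:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    variance = m2 / count if count else 0.0
    return count, mean, variance


# A generator stands in for a large record stream; nothing is held in a list.
sales = (x for x in [2, 4, 4, 4, 5, 5, 7, 9])
count, mean, variance = running_stats(sales)
print(count, mean, variance)
```

The same pattern extends to other aggregates that can be updated incrementally, which keeps memory flat as the record count grows.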


📫 Let's Connect

LinkedIn Email

Other Tools:

AWS, Bootstrap, C, C++, CSS3, Dart, Django, Docker, Express, Figma, Firebase, Flask, Flutter, Git, GraphQL, Heroku, HTML5, Java, JavaScript, Kotlin, MongoDB, MySQL, Next.js, Node.js, OpenCV, PHP, Postman, Python, React, Redux, Spring


+@ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @+
@@       o o                                           @@
@@       | |                                           @@
@@      _L_L_                                          @@
@@   ❮\/__-__\/❯ Programming isn't about what you know @@
@@   ❮(|~o.o~|)❯  It's about what you can figure out   @@
@@   ❮/ \`-'/ \❯                                       @@
@@     _/`U'\_                                         @@
@@    ( .   . )     .----------------------------.     @@
@@   / /     \ \    | while( ! (succeed=try()) ) |     @@
@@   \ |  ,  | /    '----------------------------'     @@
@@    \|=====|/                                        @@
@@     |_.^._|                                         @@
@@     | |"| |                                         @@
@@     ( ) ( )   Testing leads to failure              @@
@@     |_| |_|   and failure leads to understanding    @@
@@ _.-' _j L_ '-._                                     @@
@@(___.'     '.___)                                    @@
+@ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @+

Pinned

  1. Industrial-Demand-Forecasting-Pipeline (Public)

    Architected a high-performance predictive pipeline processing 15 Million transactions. Optimized memory by 70% via custom downcasting and implemented Tweedie-LightGBM to solve zero-inflation in ret…

    Jupyter Notebook 14

  2. Amazon-BigData-Verified-Review-Classifier (Public)

    Scalable Trust-Signal Detection: A Big Data pipeline using PySpark and GCP Dataproc to classify 8GB+ of Amazon reviews with high-precision Random Forest modeling. Engineered for horizontal scalabil…

    Python 3

  3. Multi-Domain-CV-Intelligence-Workspace (Public)

    A high-precision Computer Vision workspace featuring ReUNet for medical segmentation and MobileNetV2 for AgTech classification. Demonstrating cross-domain AI expertise in Healthcare Diagnostics and…

    Jupyter Notebook 1

  4. Retail-Data-Engineering-Pipeline (Public)

    Scalable ETL Pipeline: Processing 5M+ retail records with PySpark on GCP Dataproc. Automated the extraction of global business KPIs and consumer trends. Includes an Ethical Data Framework to ensure…

    Python 14

  5. Biometric-Attendance-Engine (Public)

    Real-time face recognition system using HOG encodings and Dlib landmarks. Features a high-speed Flask/OpenCV pipeline for live video processing and automated SQL database logging

    HTML 16 1

  6. Intelligent-Travel-Recommendation-Engine (Public)

    An Intelligent Travel Recommendation Engine using TF-IDF Vectorization and KNN to predict optimal tourist destinations. Features a modular Python/Tkinter architecture and mathematical similarity sc…

    Python 8