Sho670/README.md

🚀 Hi, I'm SHOHOM CHAKRABORTY

A Data Engineer who loves building, breaking, and learning things 💻


🌟 About Me

  • 🔭 I’m currently working on my major project: a cybersecurity project for automated threat detection using Docker and AI workflow automation

  • 🌱 I’m currently learning: Workflow Orchestration, Data Pipelines, Amazon Web Services

  • 💡 I love solving real-world problems with code

  • ⚡ Fun fact: I enjoy debugging more than writing initial code 😄


🧠 Skills

Programming Languages:
SQL • NoSQL • Python • Java

Data and Machine Learning Libraries:
Pandas • NumPy • Scikit-learn

Visualization Tools:
Microsoft Power BI • Tableau • Seaborn • Plotly • Matplotlib

Database:
MongoDB • MongoDB Compass • MySQL • MySQL Server

Tools:
Git • GitHub • Google Antigravity • Visual Studio Code • Apache Spark • Apache Airflow • Apache Kafka • Databricks


🚀 Top 4 Projects

🔹 Project One: Amazon Prime Content Analysis: A MongoDB-driven data analysis project

An extended data analysis project on an Amazon Prime dataset covering titles from 1920 to 2021.

Worked with a dataset of 100,000+ records covering different kinds of Movies and TV Shows.

The project produced a range of informative analyses of the Amazon Prime data using aggregation pipelines built in MongoDB Compass.
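As a rough illustration of the kind of aggregation pipeline described above, the sketch below counts titles per content type. The field names `type` and `release_year` and the sample documents are assumptions made for this example, not the dataset's actual schema or data. The pure-Python function mirrors what the pipeline stages would compute so the example runs without a MongoDB server.

```python
from collections import Counter

# Hypothetical field names ("type", "release_year"); the real schema may differ.
titles_per_type_pipeline = [
    {"$match": {"release_year": {"$gte": 1920, "$lte": 2021}}},  # keep the study period
    {"$group": {"_id": "$type", "title_count": {"$sum": 1}}},    # count titles per type
    {"$sort": {"title_count": -1}},                              # largest group first
]


def count_titles_per_type(docs):
    """Pure-Python equivalent of the pipeline above for in-memory documents."""
    counts = Counter(
        doc["type"] for doc in docs if 1920 <= doc["release_year"] <= 2021
    )
    return [{"_id": kind, "title_count": n} for kind, n in counts.most_common()]


# Made-up sample documents, for illustration only.
sample_docs = [
    {"title": "A", "type": "Movie", "release_year": 1999},
    {"title": "B", "type": "TV Show", "release_year": 2015},
    {"title": "C", "type": "Movie", "release_year": 2020},
]
result = count_titles_per_type(sample_docs)
print(result)
```

Against a live database, the same list of stages could be passed to PyMongo's `collection.aggregate(...)` or entered stage by stage in MongoDB Compass.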

🔹 Project Two: Automated Sales ETL Pipeline: A batch-processing ETL project

Deployed an automated ETL pipeline to process transactional sales data and performed KPI aggregation to store an optimized analytics dataset in Delta Lake.

Used Apache Airflow for workflow orchestration and stored the processed data as Delta tables.
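The transform and KPI aggregation steps of a batch job like this can be sketched in plain Python. This is a minimal stand-in, not the project's actual code: the column names (`order_date`, `quantity`, `unit_price`), the choice of daily KPIs, and the sample rows are all assumptions, and the real pipeline would run these steps on Spark with Delta Lake storage rather than in-memory dictionaries.

```python
from collections import defaultdict


def transform(rows):
    """Drop malformed rows and compute revenue per transaction."""
    cleaned = []
    for row in rows:
        try:
            order_date = row["order_date"]
            quantity = int(row["quantity"])
            unit_price = float(row["unit_price"])
        except (KeyError, ValueError, TypeError):
            continue  # skip rows with missing or non-numeric fields
        cleaned.append(
            {"order_date": order_date, "quantity": quantity,
             "revenue": quantity * unit_price}
        )
    return cleaned


def aggregate_kpis(cleaned):
    """Aggregate daily KPIs: total revenue, units sold, and order count."""
    kpis = defaultdict(lambda: {"total_revenue": 0.0, "units_sold": 0, "orders": 0})
    for row in cleaned:
        day = kpis[row["order_date"]]
        day["total_revenue"] += row["revenue"]
        day["units_sold"] += row["quantity"]
        day["orders"] += 1
    return dict(kpis)


# Made-up raw rows, for illustration only; the last one is deliberately malformed.
raw_rows = [
    {"order_date": "2024-01-01", "quantity": "2", "unit_price": "10.0"},
    {"order_date": "2024-01-01", "quantity": "1", "unit_price": "5.5"},
    {"order_date": "2024-01-02", "quantity": "bad", "unit_price": "3.0"},
]
daily_kpis = aggregate_kpis(transform(raw_rows))
print(daily_kpis)
```

In an orchestrated setup, each function would typically become its own task so that a failed step can be retried without rerunning the whole batch.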

🔹 Project Three: Company Sales Revenue Analysis using MongoDB

The project focuses on analyzing company sales and revenue data using MongoDB. The system is designed to store, process, and analyze large volumes of business transaction data efficiently.

Using MongoDB aggregation pipelines, the project generates business insights such as revenue trends, product performance, and regional sales analysis, with Microsoft Power BI used for visualization.

The primary goal of this project is to demonstrate how MongoDB can be used for real-world business analytics and decision-making processes.

🔹 Project Four: Review Authenticity & Sentiment Analyzer: A Generative AI-based project

The project applies Natural Language Processing (NLP) and Machine Learning (ML) algorithms to a real-world problem: identifying genuine reviews.

It targets two key dimensions: authenticity (detecting fake or bot-generated reviews) and sentiment (determining whether the review expresses positive, negative, or neutral emotions).

Both tasks involve distinct methods but can be integrated for a comprehensive review assessment.
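One way the two dimensions might be combined into a single assessment is sketched below. It assumes the upstream models have already produced a fake-review probability and per-label sentiment scores. The `ReviewAssessment` type, the `assess_review` helper, the 0.5 threshold, and the example scores are all hypothetical and not taken from the project.

```python
from dataclasses import dataclass


@dataclass
class ReviewAssessment:
    authentic: bool  # True when the review is judged genuine
    sentiment: str   # "positive", "neutral", or "negative"


def assess_review(fake_probability, sentiment_scores, fake_threshold=0.5):
    """Combine an authenticity score and sentiment scores into one result.

    fake_probability: assumed output of an authenticity classifier, in [0, 1].
    sentiment_scores: assumed mapping of sentiment label to model score.
    fake_threshold: hypothetical cut-off; tune it on validation data.
    """
    authentic = fake_probability < fake_threshold
    sentiment = max(sentiment_scores, key=sentiment_scores.get)
    return ReviewAssessment(authentic=authentic, sentiment=sentiment)


# Made-up scores, for illustration only.
assessment = assess_review(
    0.12, {"positive": 0.81, "neutral": 0.14, "negative": 0.05}
)
print(assessment)
```

Keeping the two signals as separate fields lets downstream code filter out suspected fake reviews before aggregating sentiment.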

Tools Used: Scikit-learn, Hugging Face Transformers models, Pandas, NumPy, and more.


📈 GitHub Journey

  • 🌱 Started learning to code in 2020.

  • 💻 Built my first project during the early years of my GitHub journey.

  • 🚀 Contributed 400+ GitHub commits and counting, improving daily with new and meaningful insights.


🌐 Connect With Me


💭 Quote I Live By

"Consistency works with Discipline."


🎯 Current Focus

Improving my skills one commit at a time and learning from each contribution.

Pinned

  1. Amazon_Content-Analysis---A-MongoDB-based-Project Public

    The project focuses on analyzing an OTT (Over-The-Top) platform dataset using MongoDB. The aim is to perform data cleaning, transformation, and exploratory data analysis to extract meaningful insig…

  2. Company-Sales-Revenue-Analysis-Using-MongoDB Public

    This project focuses on analyzing company sales and revenue data using MongoDB, a NoSQL document-oriented database. The system is designed to store, process, and analyze large volumes of business t…

  3. Automated-Sales-ETL-Pipeline Public

    This project is an automated ETL pipeline that collects data from the dataset provided, transforms it, and loads it into a warehouse for analytics. It uses tools like Apache Spark and Databricks for …

  4. Data-Supply-Optimization-System---A-Machine-Learning-Predictive-Model Public

    The “AI-Driven Data Supply Optimization System” is typically a machine learning platform that predicts demand, optimizes inventory/data flow, and automates supply-chain decisions using batch time d…

    Python

  5. Review-Authenticity-Sentiment-Analyzer--A-Generative-AI-based-Project Public

    Generative AI systems evaluate reviews for authenticity, identifying fake or bot-generated content through text patterns, behavioral signals, and contextual consistency. Reviews classified as genui…

    Python

  6. Solar-Usage-Project-A-Web-Development-Based-Project- Public

    The Solar Usage Project aims to provide meaningful insights into how efficient solar energy can be, including for household users. The project also contains AI Detection …