What I Worked On – Naman Bhoj

Full Stack

▶ 1. ForYourResearch 2023 – 2025

React TailwindCSS Django PostgreSQL AWS LangChain Pinecone OpenAI Docker GitHub Actions Scrapy Airflow

Customer: Researchers conducting systematic literature reviews (35 ICPs at UofC, 50 stakeholders engaged).
What we did: Built a web app that automates data collection for literature reviews through web scraping and text analysis.
Status Quo: Traditional manual approach (12–18 months, $30,000–50,000) vs. automated approach (1 week, hundreds of dollars).
Outcomes: 98% timeline reduction and 99% cost reduction.
Context: Evidence-based research requiring efficient literature review processes. Conducted customer discovery to identify beachhead market and develop quantified value proposition.
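
The outcome percentages above follow from the status quo figures. A minimal sketch of the arithmetic, assuming an automated cost of 300 USD (the source only says "hundreds of dollars", so that value is illustrative) and an average of 52/12 weeks per month:

```python
def reduction_pct(before: float, after: float) -> float:
    """Percentage reduction when going from `before` to `after`."""
    return (before - after) / before * 100


WEEKS_PER_MONTH = 52 / 12  # average weeks per month

# Timeline: 1 week automated vs. 12-18 months manual (figures from the entry).
timeline_low = reduction_pct(12 * WEEKS_PER_MONTH, 1)
timeline_high = reduction_pct(18 * WEEKS_PER_MONTH, 1)

# Cost: the automated cost is NOT itemized in the entry; 300 USD is assumed.
ASSUMED_AUTOMATED_COST_USD = 300
cost_low = reduction_pct(30_000, ASSUMED_AUTOMATED_COST_USD)
cost_high = reduction_pct(50_000, ASSUMED_AUTOMATED_COST_USD)

print(f"timeline reduction: {timeline_low:.1f}% to {timeline_high:.1f}%")
print(f"cost reduction:     {cost_low:.1f}% to {cost_high:.1f}%")
```

Under these assumptions both ranges sit at or just above the rounded 98% and 99% figures. A higher automated cost would lower the cost reduction accordingly.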

▶ 2. Vaccine Rules Intelligence Pipeline 2025

FastAPI AWS Lambda SQS SNS SES S3 EventBridge Glue Athena CDK Claude API Retool

Customer: Teams or individuals needing automated AI-generated recommendations delivered daily across multiple regions.
What we did: Built a serverless, event-driven pipeline that processes AI prompts through Claude, fans results out to email and a queryable data lake, and runs fully automated on a daily cron with no manual intervention.
Status Quo: Manually running prompts, copy-pasting outputs, emailing results, and managing spreadsheets vs. a fully automated pipeline that ingests, processes, stores, and delivers AI results across any number of countries daily.
Outcomes: 100% elimination of manual prompt-to-delivery workflow. Horizontally scalable to any number of jobs with zero additional infrastructure changes.
Architecture: FastAPI (entry point, rate limiting) → SQS (job_queue) → Lambda (claude_worker) → Claude API → SNS fan-out → [email_queue → Lambda → SES] + [s3_queue → Lambda → S3 Bronze]. EventBridge triggers daily scheduler and Glue ETL (Bronze JSON → Silver Parquet). Athena queries Silver layer; Retool surfaces results. All infra via CDK.
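
The worker and fan-out steps of the architecture above can be sketched in plain Python. This is an illustration under stated assumptions, not the deployed code: the job shape (country, prompt), the make_claude_worker helper, and the InMemoryTopic stand-in for SNS are all hypothetical, and the model call is stubbed so nothing reaches a real API. The event layout and the batchItemFailures response follow the standard Lambda SQS batch format.

```python
import json
from typing import Callable


def make_claude_worker(call_model: Callable[[str], str],
                       publish: Callable[[str], None]):
    """Build a Lambda-style handler for SQS batches (hypothetical wiring)."""
    def handler(event: dict, context: object = None) -> dict:
        failures = []
        for record in event.get("Records", []):
            try:
                job = json.loads(record["body"])  # assumed job shape
                result = {
                    "country": job["country"],
                    "output": call_model(job["prompt"]),
                }
                publish(json.dumps(result))  # hand off to the fan-out topic
            except Exception:
                # Report only the failed message so SQS can retry it alone.
                failures.append({"itemIdentifier": record["messageId"]})
        return {"batchItemFailures": failures}
    return handler


class InMemoryTopic:
    """Stand-in for an SNS topic: every subscriber receives every message."""
    def __init__(self):
        self.subscribers = []

    def subscribe(self, queue: list) -> None:
        self.subscribers.append(queue)

    def publish(self, message: str) -> None:
        for queue in self.subscribers:
            queue.append(message)


# Two downstream queues, mirroring email_queue and s3_queue in the diagram.
email_queue, s3_queue = [], []
topic = InMemoryTopic()
topic.subscribe(email_queue)
topic.subscribe(s3_queue)

worker = make_claude_worker(
    call_model=lambda prompt: f"stub answer for: {prompt}",  # no real API call
    publish=topic.publish,
)

event = {"Records": [
    {"messageId": "m1", "body": json.dumps({"country": "CA", "prompt": "p1"})},
    {"messageId": "m2", "body": "not json"},  # malformed on purpose
]}
response = worker(event)
print(response)
print(len(email_queue), len(s3_queue))
```

Passing the model call and the publish function in as arguments keeps the handler testable without AWS credentials. In a real deployment they would presumably wrap the Claude API client and an SNS publish call.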

▶ 3. Pawitraa 2019 – 2021

React Django PostgreSQL DigitalOcean TailwindCSS Docker

Customer: Students in Himalayan regions of India lacking access to quality education.
What we did: Built web platform delivering evidence-based educational content in Maths, Life Sciences, and Computer Science.
Status Quo: Limited access to quality education resources in remote regions.
Outcomes: Bridged accessibility gap through scalable digital platform.
Context: Non-profit tech initiative addressing educational inequity in underserved communities.

▶ 4. Real-Time Digital Twin via LiDAR + iPad 2023 – 2024

Unity C# Apple LiDAR ARKit Gaussian Splatting LangChain OpenAI Docker GitHub Actions

Customer: TELUS Communication customer support workers, along with construction teams, facility managers, and AR collaborators who require spatial understanding.
What we did: Built real-time pipeline capturing spatial data via iPad LiDAR and rendering as 3D digital twins using Gaussian Splatting.
Status Quo: Manual site documentation and remote collaboration limited by 2D representations.
Outcomes: Real-time spatial capture enabling immersive digital twins for analysis and collaboration.
Context: Applications in smart construction, facility management, and distributed AR collaboration.

▶ 5. End-to-End VR Remote Support with LLM Avatar 2024 – 2025

Unity Meta Quest LLM Speech-to-Text ARKit Docker GitHub Actions

Customer: Remote support teams and field technicians requiring hands-free, context-aware assistance at TELUS Communication.
What we did: Built an end-to-end pipeline that scans physical environments with iPad LiDAR, reconstructs them as digital twins, and streams them to Meta Quest with LLM-powered voice interaction.
Status Quo: Traditional remote support lacks spatial context and requires manual reference materials.
Outcomes: Hands-free, context-aware VR experiences enabling remote experts to assist with full environmental understanding.
Context: VR-based remote support for field operations requiring spatial reasoning and collaborative problem-solving.

▶ 6. Doc to HTML 2025

Next.js FastAPI OpenAI TypeScript Docker

Customer: Organizations in mission-critical industries (Healthcare, Defense, Law) managing internal documentation.
What we did: Built rapid prototype converting internal docs to structured HTML guides using AI.
Status Quo: Manual documentation conversion is time-consuming and error-prone.
Outcomes: Prototype developed within 6 hours; streamlines multi-guide conversion to HTML.
Context: Early-stage prototype; future improvements include semantic search for document clustering and comprehensive guide generation.

AI / Data Science

▶ 7. Time-Series Energy Consumption Prediction for Smart Homes 2022

Python TensorFlow Keras PyTorch scikit-learn AWS SageMaker Docker

Customer: Utility companies and residential homeowners in smart home ecosystems.
What we did: Developed advanced time-series forecasting models for energy consumption prediction.
Status Quo: Traditional forecasting methods lack accuracy for demand-response optimization.
Outcomes: Improved energy usage predictions enabling optimized power generation, smarter demand-response systems, and reduced carbon footprints.

▶ 8. LSTM-Powered Identification of Clickbait Content 2021

Python TensorFlow Keras LSTM Random Forest NLP AWS SageMaker

Customer: Online content consumers and platforms combating misinformation.
What we did: Built NLP-based detection system using LSTM neural networks for clickbait identification.
Outcomes: LSTM model achieved 95.03% accuracy, outperforming Random Forest (93.89%) and Naive Bayes (93.32%).

▶ 9. AI-Powered, Low-Latency Intrusion Detection for Power Systems 2021

Python scikit-learn Random Forest Gradient Boosting SVM AWS SageMaker

Customer: Power grid operators and critical infrastructure protecting against cyber threats.
What we did: Developed ML-based intrusion detection using feature selection and ensemble methods.
Outcomes: 11.874% accuracy improvement using Random Forest feature selection with SVM, requiring only 30 features.

▶ 10. Spam Job Posting Detection via Employer Linguistic Features 2022 · ICAIC

Python NLP scikit-learn TF-IDF

Customer: Job seekers and recruitment platforms combating fraudulent listings.
What we did: Designed a linguistic feature extraction pipeline from employer-authored text to classify spam job postings.
Outcomes: Effective classification using domain-specific linguistic cues derived from employer language patterns.

▶ 11. Feature Selection and Scaling for Random Forest Malware Detection 2021 · IEEE CSNT

Python scikit-learn Random Forest SMOTE

Customer: Cybersecurity teams and endpoint protection platforms.
What we did: Evaluated feature selection and scaling strategies to optimize a Random Forest malware detection pipeline.
Outcomes: Targeted feature selection significantly improves detection accuracy and reduces model complexity.

▶ 12. Improved Identification of Negative Covid-19 Vaccination Tweets 2021 · IEEE CICN

Python NLP LSTM SMOTE TensorFlow

Customer: Public health agencies monitoring vaccine sentiment.
What we did: Built a sentiment classification pipeline to identify negative vaccine-related tweets, addressing severe class imbalance.
Outcomes: Improved recall and F1 on negative tweet detection by applying SMOTE and re-weighting strategies.
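
The two imbalance techniques named above can be illustrated with a short standard-library sketch. It assumes the tweets have already been turned into numeric feature vectors (the entry does not say how), uses a simplified SMOTE variant that picks base points at random, and takes inverse-frequency weights as one plausible reading of "re-weighting". The function names and toy data are hypothetical.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    point and one of its k nearest minority neighbours (simplified SMOTE)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base`, excluding `base` itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (o - b) for b, o in zip(base, other))
        )
    return synthetic


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    n_samples, n_classes = len(labels), len(counts)
    return {y: n_samples / (n_classes * c) for y, c in counts.items()}


# Toy 2-D minority class (illustrative data, not from the paper).
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote(minority, n_new=5)

labels = ["negative"] * 2 + ["other"] * 8
weights = balanced_class_weights(labels)
print(new_points)
print(weights)
```

Each synthetic point lies on the segment between two real minority points, so oversampling adds variety without leaving the region the minority class already occupies. The weights make an error on the rare class count for more than an error on the common one.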

▶ 13. Naive and Neighbour Approach for Phishing Detection 2021 · IEEE CSNT

Python scikit-learn Naive Bayes KNN

Customer: End users and browser security platforms targeted by phishing.
What we did: Built and benchmarked Naive Bayes and KNN classifiers on URL and page-level features.
Outcomes: Lightweight ML models demonstrated as viable real-time phishing detectors with low inference overhead.

▶ 14. Comparative Feature Selection for Malicious Website Detection 2021 · RS Open Journal

Python scikit-learn SMOTE Random Forest Decision Tree

Customer: Browser security vendors and enterprise network defenders.
What we did: Comparative analysis of feature selection techniques on SMOTE-balanced datasets for malicious website detection.
Outcomes: Identified optimal feature selection approaches that maximize detection precision after data balancing.

▶ 15. ML Framework for Security and Privacy in Social Networking 2023 · Cluster Computing

Python scikit-learn Graph ML NLP

Customer: Social network operators and users requiring trust and privacy guarantees.
What we did: Designed an ML framework addressing security threats and privacy leakage in social network data.
Outcomes: Proposed and validated a cohesive framework integrating threat detection, anomaly scoring, and privacy-aware data handling.

▶ 16. Challenges and Opportunities in Edge Computing Architecture Using ML 2022 · AI and ML for Edge Computing

Python TensorFlow Lite Edge ML

Customer: IoT system architects and edge infrastructure engineers.
What we did: Surveyed ML deployment strategies for edge computing environments, identifying key architectural trade-offs.
Outcomes: Mapped opportunities for lightweight model deployment at the edge, informing architecture decisions for constrained environments.

▶ 17. Tree-Based Classification of Firewall Logs to Dodge Intrusion 2021 · IEEE CSNT

Python scikit-learn Decision Tree Random Forest

Customer: Network security operations teams monitoring firewall activity.
What we did: Applied tree-based classifiers to structured firewall log data to detect and classify intrusion attempts.
Outcomes: Tree-based models accurately classify intrusion events from structured firewall logs with low overhead.

▶ 18. Ensemble, Distance and Tree Methods for Secure Power Systems 2021 · ICDABI

Python scikit-learn Gradient Boosting KNN Ensemble Methods

Customer: Power grid security engineers evaluating ML-based anomaly detection.
What we did: Benchmarked ensemble, distance-based, and tree-based classifiers for detecting anomalies in power system data.
Outcomes: Empirical guidance on model selection for power system security, identifying performance trade-offs across families.

▶ 19. NLP to Identify Potential Medical Conditions from Natural Language 2021

Python NLP LSTM BERT TensorFlow

Customer: Healthcare providers and clinical decision support systems.
What we did: Explored NLP-based approaches to extract and classify medical conditions from natural language text.
Outcomes: Intelligent system capable of identifying potential medical conditions from patient-authored and clinical text.

▶ 20. Robust Malware Detection Using Machine Learning 2021 · IEEE CSNT

Python scikit-learn Random Forest SVM Feature Engineering

Customer: Endpoint security vendors and enterprise IT defense teams.
What we did: Built and evaluated a robust ML pipeline for malware detection using static and behavioral features.
Outcomes: ML-based pipeline demonstrated strong generalization against unseen malware families with high detection rates.