I’m a Master’s student at Columbia University in the heart of New York City, pursuing Applied Analytics at the intersection of data, technology, and business. Originally from Delhi, my journey has taken me across cities and continents, from moving 1,300 miles for my undergraduate degree to nearly 7,500 miles to New York for my Master’s. With a background in Computer Science Engineering, I’ve naturally gravitated toward analytics and data-driven problem solving, where technology connects directly with real-world decisions.
Master of Science in Applied Analytics · New York City, USA · GPA: 3.92/4.30
Starting my MS in Applied Analytics at Columbia University has felt like stepping into a faster gear, academically and personally. Moving from Delhi to New York came with the intensity of an Ivy League environment and the pace of a city that never really pauses. Between classes, group projects, and the constant energy around campus, I’m learning to operate at the intersection of data, technology, and business while building a routine that keeps me grounded.
The academic culture here pushes me to show up prepared every week. Surrounded by highly driven peers, faculty, and industry practitioners, expectations extend beyond grades to how clearly I can think, communicate, and defend my approach. The rigor and pace can feel demanding at times, but being in this environment has raised my standard for how I plan, execute, and learn.
Academically, the focus has been on developing strong, end-to-end analytical foundations. This includes understanding data infrastructure, transforming raw data into dashboards and reports, and communicating insights clearly to support decision-making. Alongside this, I’ve worked on training and evaluating machine learning models with an emphasis on practical application. More recently, I’ve been exploring how emerging technologies like blockchain are reshaping digital transactions, and how cloud platforms enable organizations to scale, deploy, and maintain modern data systems reliably.
Group projects have played a central role in this experience, often bringing together teammates with diverse cultural backgrounds and professional experiences. Working in these settings has helped me communicate more deliberately, align expectations early, and stay adaptable when constraints evolve. It has also strengthened my confidence in presenting ideas, incorporating feedback, and delivering outcomes under tight timelines.
What I appreciate most about NYC is its fast pace and its diversity. The city never pauses, whether for heavy rain, wind, or a blizzard, and its crowds span cultures, religions, and life experiences. Outside the classroom, the city constantly offers opportunities through networking events, talks, student clubs, and conversations that can happen in subway trains and cafes. Beyond the professional side, NYC has also become a place to explore: trying new cuisines, discovering different neighborhoods, and building a social life in a completely new environment.
Over time, I’ve learned to find structure within the city’s constant momentum. As I’ve started my second semester, I’m becoming more deliberate about managing my time and energy across academics, fitness, social life, and exploration. The balance isn’t perfect, but it’s evolving, and it’s helped me stay focused on building strong, real-world applied analytics skills.
Bachelor of Technology in Computer Science & Engineering · Vellore, Tamil Nadu, India · GPA: 3.96/4.00
Moving to Vellore Institute of Technology marked the first major shift in my life, both academically and personally. It was my first time living away from family, learning independence while sharing hostel spaces with students from diverse backgrounds. At the same time, the program laid a strong foundation in core computer science, shaping how I approached problem-solving, structured thinking, and disciplined execution during my undergraduate years.
Joining VIT was my first experience of stepping outside a familiar environment. Living on campus meant adapting quickly to new routines, shared spaces, and a diverse peer group. This transition helped me become more self-reliant and adaptable, while learning how to navigate change without losing focus on long-term goals.
The engineering program introduced a demanding academic structure defined by continuous evaluation. Regular assignments, lab work, quizzes, and examinations created a steady pace that rewarded consistency over short-term effort. This environment strengthened my discipline, time management, and ability to sustain focus across long academic cycles.
Core computer science subjects formed the technical backbone of my undergraduate experience. Through coursework and labs, I developed a strong grounding in algorithms and data structures, programming across multiple languages, database systems, and problem-solving through mathematics. Exposure to operating systems, computer networks, and basic security concepts helped me understand how software systems function end to end. Implementing solutions, debugging under constraints, and explaining my approach during evaluations helped translate theory into practical understanding.
Life on campus contributed significantly to my growth outside academics. Living in hostels and working on team projects pushed me to communicate more clearly, build new friendships, and adjust to different personalities and working styles. Managing coursework alongside everyday responsibilities improved my time management and independence, while navigating shared spaces and long-term collaborations strengthened my emotional awareness and adaptability.
Over time, I began developing a stronger interest in working with data. Tasks that involved organizing information, analyzing patterns, and understanding outcomes felt engaging and intuitive. This growing curiosity led me to explore data-focused problems more deeply and reflect on where I wanted to take my career. Eventually, it became clear that pursuing higher education in analytics would allow me to build on my technical background while pivoting toward the data side of technology.
My work experience spans data science and analytics roles where the focus has always been practical impact. I’ve worked with real-world data to build pipelines, dashboards, and analyses that help teams understand what’s happening and decide what to do next.
Data Science Intern · Remote
Analyzed a media and technology dataset based on YouTube channel performance to understand subscriber growth, content categories, earnings patterns, and geographic trends.
Role: Data Science Intern
Company: Analytics and finance education platform working with real-world, case-based datasets
Finlatics (operated by Fincrux Technologies LLP) is an analytics-focused education platform that runs structured, hands-on programs using real-world business datasets. The organization is recognized by DPIIT, a Government of India body that supports early-stage startups, and emphasizes practical, applied learning over theoretical coursework.
I worked on end-to-end exploratory data analysis of a media and technology dataset centered on YouTube channels. Using Python libraries such as Pandas and NumPy, I cleaned raw data, handled missing and inconsistent values, and analyzed patterns across subscriber growth, content categories, estimated earnings, and geographic distribution. The analysis was structured around specific business questions rather than open-ended exploration.
I delivered a detailed analytical report and presentation that connected YouTube performance metrics with broader business implications, highlighting how content type, scale, and geography influenced growth and monetization trends. The final deliverables focused on clarity of insights rather than complex modeling.
This experience strengthened my ability to work independently with real-world datasets, structure exploratory analysis logically, and communicate insights in a clear, non-technical manner.
Business Analyst Intern · Remote
Worked on consulting-style business case studies focused on profitability improvement, feasibility analysis, and data-driven decision-making.
Role: Business Analyst Intern
Company: Analytics and finance education platform working with real-world, case-based datasets
Finlatics (operated by Fincrux Technologies LLP) runs structured business and analytics programs designed around real-world case studies. The focus is on structured thinking, feasibility analysis, and communicating insights clearly in a decision-oriented format.
I worked on multiple consulting-style business case studies, each with a clearly defined objective and structured approach. For a turnaround strategy case involving a technology-focused company, I analyzed revenue streams, cost structures, and margin drivers using Excel, identified key loss-making segments, and evaluated strategic levers such as pricing, cost rationalization, and market focus. The analysis was structured using MECE principles to ensure clarity and logical flow.
In a separate tourism feasibility case, I assessed the viability of a proposed tourism project by estimating demand, projecting revenues, and analyzing fixed and variable costs. I used Excel to build simple financial models and scenario analyses, and supported the findings with basic visual summaries in Power BI to clearly communicate trade-offs and risks.
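The feasibility logic in a case like this comes down to a contribution-margin calculation run across demand scenarios. The sketch below is a minimal Python illustration of that idea (the original models were built in Excel). The `Scenario` class, the price, the cost figures, and the visitor counts are all hypothetical placeholders, not data from the case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """One demand scenario for a hypothetical tourism project."""

    name: str
    annual_visitors: int
    price_per_visitor: float
    variable_cost_per_visitor: float
    fixed_cost: float

    def margin(self) -> float:
        # Contribution margin earned on each visitor.
        return self.price_per_visitor - self.variable_cost_per_visitor

    def profit(self) -> float:
        # Total contribution minus fixed costs.
        return self.annual_visitors * self.margin() - self.fixed_cost

    def break_even_visitors(self) -> float:
        # Volume at which contribution exactly covers fixed costs.
        if self.margin() <= 0:
            return float("inf")
        return self.fixed_cost / self.margin()


# Illustrative numbers only (assumed, not taken from the case study).
scenarios = [
    Scenario("pessimistic", 20_000, 50.0, 30.0, 600_000.0),
    Scenario("base", 40_000, 50.0, 30.0, 600_000.0),
    Scenario("optimistic", 60_000, 50.0, 30.0, 600_000.0),
]

for s in scenarios:
    print(
        f"{s.name:<12} profit={s.profit():>12,.0f} "
        f"break-even={s.break_even_visitors():,.0f} visitors"
    )
```

Holding price and costs fixed while varying only demand makes the trade-off easy to read: the project is viable only in scenarios whose visitor count clears the break-even volume.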
I delivered structured case reports and presentations that clearly outlined problem statements, assumptions, analytical findings, and final recommendations. Each deliverable focused on decision-making clarity, highlighting feasibility, risks, and expected outcomes rather than exhaustive numerical detail.
This role strengthened my ability to think top-down, make assumptions explicit, and present analysis in a concise, decision-ready format aligned with how business stakeholders evaluate trade-offs.
Data Science Intern · New Delhi, India
Developed and integrated machine learning components into a client-facing SaaS platform, supporting data-driven product decisions and improved user engagement.
Role: Data Science Intern
Company: SaaS-focused software consulting firm
Invansys Technologies is a niche software consultancy specializing in outsourced development of web and mobile SaaS platforms. The firm works with clients across industries to build scalable, production-ready applications, with a focus on innovation, reliability, and solutions that perform effectively in dynamic market environments.
I worked on developing and integrating lightweight machine learning components, including sentiment analysis, within a client-facing SaaS platform. My responsibilities included data preparation, basic model training and evaluation using Python, and collaborating with engineers to ensure these components fit smoothly into the existing product architecture.
The integrated models supported improved analysis of user behavior and informed feature-level product decisions, contributing to an estimated ~13% improvement in user engagement on the client’s platform.
This experience reinforced the importance of building data and ML solutions that are reliable, maintainable, and aligned with real product requirements rather than standalone experiments.
Data Analyst Intern · Gurgaon, Delhi NCR, India
Designed a multi-level RSTO (Reverse Stock Transfer Order) operations dashboard that reduced reverse stock pendency by ~8%, enabling faster corrective action across stores and cities.
Role: Data Analyst Intern
Company: Quick-commerce platform (groceries delivered in under 10 minutes), part of Eternal (Zomato)
Blinkit is a quick-commerce platform where real-time operational visibility is critical to efficiency across stores, warehouses, and last-mile delivery.
I worked closely with both the Central Operations and Store Operations teams across different phases of the internship. During my time with the Central Ops team, I focused on last-mile delivery analytics by navigating a Redshift data warehouse and writing complex SQL queries to extract store-, city-, and rider-level data. These queries powered dashboards and trackers requested by managers and business POCs, supporting day-to-day operational decisions across delivery performance and rider onboarding workflows.
In the Store Ops phase, I was assigned an end-to-end project to design a comprehensive dashboard for tracking Reverse Stock Transfer Orders (RSTO), representing goods sent back from stores to warehouses. This involved integrating data from multiple teams such as inventory and store operations, building robust SQL pipelines with scheduled 24-hour refresh cycles, and creating dashboards with city-level, store-level, and day-wise pendency views. These views helped assess store-level efficiency and identify locations contributing disproportionately to backlog.
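The city-level, store-level, and day-wise pendency views described above can be sketched as a single grouped aggregation. The example below is a minimal illustration under stated assumptions. It uses an in-memory SQLite database as a stand-in for the Redshift warehouse. The `rsto` table, its columns, the sample rows, and the definition of pendency rate as pending orders divided by total orders are all hypothetical, and it does not reproduce the production pipeline.

```python
import sqlite3

# Hypothetical schema: one row per reverse stock transfer order.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE rsto (
        rsto_id      INTEGER PRIMARY KEY,
        city         TEXT NOT NULL,
        store_id     TEXT NOT NULL,
        created_date TEXT NOT NULL,  -- ISO date string
        status       TEXT NOT NULL   -- 'PENDING' or 'CLOSED'
    )
    """
)
conn.executemany(
    "INSERT INTO rsto VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Delhi", "S1", "2024-06-01", "PENDING"),
        (2, "Delhi", "S1", "2024-06-01", "CLOSED"),
        (3, "Delhi", "S2", "2024-06-02", "CLOSED"),
        (4, "Gurgaon", "S3", "2024-06-01", "PENDING"),
        (5, "Gurgaon", "S3", "2024-06-02", "PENDING"),
        (6, "Gurgaon", "S3", "2024-06-02", "CLOSED"),
    ],
)

# Only these column names may be interpolated into the query text.
ALLOWED_DIMS = ("city", "store_id", "created_date")


def pendency_by(connection, *dims):
    """Return rows of (dims..., total, pending, pendency_rate), worst first."""
    if not dims or any(d not in ALLOWED_DIMS for d in dims):
        raise ValueError(f"dims must be drawn from {ALLOWED_DIMS}")
    cols = ", ".join(dims)
    query = f"""
        SELECT {cols},
               COUNT(*) AS total,
               SUM(status = 'PENDING') AS pending,
               ROUND(1.0 * SUM(status = 'PENDING') / COUNT(*), 3) AS pendency_rate
        FROM rsto
        GROUP BY {cols}
        ORDER BY pendency_rate DESC, {cols}
    """
    return connection.execute(query).fetchall()


city_view = pendency_by(conn, "city")
store_view = pendency_by(conn, "city", "store_id")
daily_view = pendency_by(conn, "created_date")

for row in store_view:
    print(row)
```

Sorting by pendency rate puts the locations that contribute most to the backlog at the top of each view, which is the ordering a dashboard would surface first.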
To better understand the real-world context behind the data, I also conducted on-ground visits to dark stores and warehouses. This helped align analytical outputs with physical operations, ensuring that the insights reflected actual operational constraints and workflows rather than just numbers on a dashboard.
The RSTO dashboard enabled early visibility into operational bottlenecks and supported targeted corrective actions, contributing to an estimated ~8% reduction in reverse stock pendency rates. In parallel, the SQL-based trackers and dashboards built for Central Ops supported daily business decision-making by improving visibility into last-mile delivery performance, rider onboarding progression, and operational throughput, allowing teams to respond faster and more effectively.
I learned that analytics creates real impact only when insights align closely with how operations teams actually work and make decisions in real time.
AVP, Business Development
As AVP of Business Development for the Columbia Applied Analytics Club (APAN), I primarily work on sponsorship outreach and partnerships, while also supporting the team in organizing and running club events.
My role focuses mainly on sponsorship outreach and partnership coordination, helping secure support that enables the club to host well-structured and useful events. Alongside this, I help organize initiatives such as Pizza Talks, where professors join students for informal discussions around coursework, research, and industry perspectives. I also assist the core team with coordination and logistics across other events to ensure smooth execution.
One of the key highlights was supporting APAN’s collaboration with Google to host the Applied Innovation Lab at the Google office. The event included an office tour followed by a three-hour innovation workshop, where selected teams developed and presented analytics proposals around Google Workspace and GenAI. The themes focused on user segmentation, feature prioritization, and evaluating the quality and value of GenAI-driven solutions.
Marketing | Expansion | Events
I worked on the Marketing, Expansion, and Events team at CSED, supporting outreach, promotions, and on-ground execution to increase participation and ensure entrepreneurship programs ran smoothly.
I supported marketing and coordination efforts across entrepreneurship-focused events, helping with outreach, promotions, and logistics. This included assisting with smaller sessions such as guest talks, founder interactions, and workshops, with a focus on clear communication and smooth execution for attendees.
Our flagship event was Startup Street 7.0, held during Gravitas (a tech fest at VIT). I was involved in offline marketing, where we promoted the event at the expo and engaged directly with potential participants. I pitched the event to interested students, which helped secure around 30 participant registrations. In addition, I contributed to planning the event timeline and coordination to ensure the overall flow of the showcase ran smoothly on the day.
1
Built an end-to-end analytics pipeline integrating NYC 311 complaints with Manhattan building datasets, analyzing 90K+ complaints across 4K+ residential buildings in four ZIP codes near Columbia University. Developed a building-level livability score and an interactive Flask + Leaflet map that renders a heat map of complaints across Manhattan buildings.
Tech Stack: Python, SQL, PostgreSQL, Flask, Leaflet.js
2
Generated 2,000+ novel molecular structures using SMILES (Simplified Molecular Input Line Entry System) manipulation with RDKit (a cheminformatics library in Python) and trained a ChemBERT transformer. Used cosine similarity to flag and eliminate high-severity candidates, supporting a scalable, cost-effective in silico dengue drug discovery pipeline.
Tech Stack: Python, RDKit, Transformers (ChemBERT), Cosine Similarity
3
Achieved 98.55% test accuracy predicting closing prices of Nifty 50 stocks (the Nifty 50 is a benchmark index of the Indian stock market) using an LSTM (Long Short-Term Memory) model. Engineered technical indicators (MACD, RSI, Bollinger Bands, Garman–Klass volatility, traded value) as features to capture market sentiment and volatility, and applied smoothing (monthly aggregation, moving averages) to improve accuracy.
Tech Stack: Python, TensorFlow/Keras, Feature Engineering, Time-Series Modeling
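Three of the indicators named above can be sketched in plain Python to show what the engineered features look like. This is a minimal illustration under common textbook conventions, not the project's actual feature code. The function names, the default windows (12/26/9 for MACD, 20 periods and 2 standard deviations for Bollinger Bands), and the choice to seed each EMA with the first observation are assumptions.

```python
import math
from statistics import mean, pstdev


def ema(values, span):
    """Exponential moving average, seeded with the first observation."""
    alpha = 2 / (span + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out


def macd(closes, fast=12, slow=26, signal=9):
    """Return (macd_line, signal_line) as lists aligned with closes."""
    fast_ema = ema(closes, fast)
    slow_ema = ema(closes, slow)
    line = [f - s for f, s in zip(fast_ema, slow_ema)]
    return line, ema(line, signal)


def bollinger(closes, window=20, k=2.0):
    """Return (lower, middle, upper) tuples for each full window."""
    bands = []
    for i in range(window - 1, len(closes)):
        w = closes[i - window + 1 : i + 1]
        mid = mean(w)
        sd = pstdev(w)
        bands.append((mid - k * sd, mid, mid + k * sd))
    return bands


def garman_klass(open_, high, low, close):
    """Single-period Garman-Klass variance estimate from OHLC prices."""
    hl = math.log(high / low) ** 2
    co = math.log(close / open_) ** 2
    return 0.5 * hl - (2 * math.log(2) - 1) * co


# Synthetic, steadily rising closes used purely for demonstration.
closes = [100.0 + i for i in range(40)]
macd_line, signal_line = macd(closes)
bands = bollinger(closes)
print(round(macd_line[-1], 3), bands[-1], garman_klass(100.0, 110.0, 90.0, 100.0))
```

Each function returns per-period values that can be stacked alongside closing prices as input columns for a sequence model such as an LSTM.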
This section brings together a set of research projects and a patent developed alongside my academic work. The focus has been to apply machine learning and data science techniques to practical problems across different domains.
Status: Patented
Developing effective drugs for dengue is slow and expensive, often failing late after significant time and resources have already been invested. This work presents an AI-driven framework that accelerates early-stage dengue drug discovery by generating and screening potential drug molecules entirely in silico. Starting from 14 known dengue-related drugs, the system generates over 2,000 candidate molecules using interchanges applied to their chemical structure representations (SMILES) and learns molecular patterns through a transformer-based model (ChemBERT). By comparing generated compounds with known drugs, the framework estimates potential side-effect severity and flags structurally high-risk molecules early, enabling more targeted screening while reducing reliance on premature lab testing.
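The comparison step can be illustrated with a small cosine-similarity sketch. It is one plausible reading of the idea rather than the patented method. Toy three-dimensional vectors stand in for transformer embeddings, and the drug names, severity labels, the `flag_high_risk` helper, and the 0.9 threshold are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def flag_high_risk(candidates, known_drugs, threshold=0.9):
    """Flag candidates whose closest known drug is high-severity.

    candidates:  dict of candidate id -> embedding
    known_drugs: dict of drug name -> (embedding, severity label)
    A candidate is flagged when its most similar known drug is labelled
    'high' and the similarity is at least `threshold`.
    """
    flagged = []
    for cand_id, cand_vec in candidates.items():
        best_name, best_sim = max(
            ((name, cosine_similarity(cand_vec, vec))
             for name, (vec, _severity) in known_drugs.items()),
            key=lambda pair: pair[1],
        )
        if known_drugs[best_name][1] == "high" and best_sim >= threshold:
            flagged.append(cand_id)
    return flagged


# Hypothetical embeddings and labels, for illustration only.
known_drugs = {
    "drug_A": ([1.0, 0.0, 0.0], "high"),
    "drug_B": ([0.0, 1.0, 0.0], "low"),
}
candidates = {
    "cand_1": [0.9, 0.1, 0.0],  # close to drug_A
    "cand_2": [0.1, 0.9, 0.0],  # close to drug_B
    "cand_3": [0.0, 0.0, 1.0],  # unlike either known drug
}

print(flag_high_risk(candidates, known_drugs))
```

Only the candidate that sits close to the high-severity reference is flagged. Candidates near a low-severity drug, or far from every reference, pass through to later screening.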
Conference Paper Presented in: iCASIC (International Conference on Automation, Signal Processing, Instrumentation and Control) · Nov 2024
In busy environments such as restaurants or kiosks, shared touchscreens can slow service and raise hygiene and accessibility concerns. This work presents a touchless dining interaction system that uses computer vision–based hand gesture recognition to enable hands-free menu navigation. The system recognizes multiple finger-based gestures through image capture and preprocessing, allowing reliable gesture detection in real time. Beyond improving hygiene, the approach also supports more inclusive interactions for users with hearing or speech impairments, demonstrating how gesture-based interfaces can reduce friction in everyday service workflows.
International Journal of Professional Studies
In everyday transportation systems, factors like rain, fog, or snow often influence delays and service reliability, but these effects are rarely analyzed in a unified way. This work combines large-scale transportation data with meteorological signals to better understand how weather conditions impact public transit performance. Using real-world data from the city of Winnipeg, the study demonstrates how weather-aware features can improve transportation analytics and support more informed planning, operational decisions, and demand analysis.
International Journal of Research in Science and Technology
In real-world data science projects, the 'best' machine learning model often changes as data grows, shifts, or new constraints are introduced. This work examines how model selection can adapt over time instead of remaining a one-time decision. It proposes an adaptive approach that accounts for changing factors such as data distribution shifts, project constraints, and evolving objectives, helping teams select models that remain better aligned as conditions change. The study demonstrates how making these factors explicit can lead to more robust, explainable, and automated model selection in large-scale data science workflows.
International Journal of Transformations in Business Management
In stock markets, prices change constantly based on trends, historical patterns, and real-time events, making short-term prediction especially challenging. This work explores a hybrid machine learning approach that combines XGBoost to capture structured indicators and LSTM networks to model time-based price movements. By integrating both models into a single pipeline, the study shows how combining statistical signals with sequential patterns can improve prediction stability and performance under noisy and volatile market conditions.
International Journal of Research in Science and Technology
In real-world systems such as voice assistants or social media platforms, models often need to understand text, images, and audio at the same time. This work reviews deep multimodal learning techniques, explaining how these models learn to combine these different data types into a shared understanding. It also highlights practical challenges such as uneven data availability across modalities, difficulty in aligning noisy signals, and limitations that arise when deploying multimodal systems in real-world settings.
If you’d like to collaborate or discuss an opportunity, reach out.
Phone
+1 (646) 249-3950
Email
ag5235@columbia.edu
arnavgoenka2003@gmail.com
Address
New York, 10025, United States