Arnav Goenka

I am a Data Storyteller, Problem Solver, Analytics Builder, and Coder.

I’m a Master’s student at Columbia University in the heart of New York City, pursuing Applied Analytics at the intersection of data, technology, and business. Originally from Delhi, my journey has taken me across cities and continents, from moving 1,300 miles for undergrad to nearly 7,500 miles to New York for my Master’s. With a background in Computer Science Engineering, I’ve naturally gravitated toward analytics and data-driven problem solving, where technology connects directly with real-world decisions.

Sep 2025 – Dec 2026

Columbia University

Master of Science in Applied Analytics · New York City, USA · GPA: 3.92/4.30

Starting my MS in Applied Analytics at Columbia University has felt like stepping into a faster gear, academically and personally. Moving from Delhi to New York came with the intensity of an Ivy League environment and the pace of a city that never really pauses. Between classes, group projects, and the constant energy around campus, I’m learning to operate at the intersection of data, technology, and business while building a routine that keeps me grounded.

The Ivy League Experience

The academic culture here pushes me to show up prepared every week. Surrounded by highly driven peers, faculty, and industry practitioners, expectations extend beyond grades to how clearly I can think, communicate, and defend my approach. The rigor and pace can feel demanding at times, but being in this environment has raised my standard for how I plan, execute, and learn.

Building analytical depth

Academically, the focus has been on developing strong, end-to-end analytical foundations. This includes understanding data infrastructure, transforming raw data into dashboards and reports, and communicating insights clearly to support decision-making. Alongside this, I’ve worked on training and evaluating machine learning models with an emphasis on practical application. More recently, I’ve been exploring how emerging technologies like blockchain are reshaping digital transactions, and how cloud platforms enable organizations to scale, deploy, and maintain modern data systems reliably.

Learning through collaboration

Group projects have played a central role in this experience, often bringing together teammates with diverse cultural backgrounds and professional experiences. Working in these settings has helped me communicate more deliberately, align expectations early, and stay adaptable when constraints evolve. It has also strengthened my confidence in presenting ideas, incorporating feedback, and delivering outcomes under tight timelines.

Life in the Big Apple

What I appreciate most about NYC is its fast-paced life and the diversity of its crowd across culture, religion, and life experience. The city never pauses, whether in heavy rain, wind, or a blizzard. Outside the classroom, it constantly offers opportunities through networking events, talks, student clubs, and conversations that can happen on subway trains and in cafes. Beyond the professional side, NYC has also become a place to explore: trying new cuisines, discovering different neighborhoods, and building a social life in a completely new environment.

Balance within the momentum

Over time, I’ve learned to find structure within the city’s constant momentum. As I’ve started my second semester, I’m becoming more deliberate about managing my time and energy across academics, fitness, social life, and exploration. The balance isn’t perfect, but it’s evolving, and it’s helped me stay focused on building strong, real-world applied analytics skills.

Sep 2021 – Aug 2025

Vellore Institute of Technology (VIT)

Bachelor of Technology in Computer Science & Engineering · Vellore, Tamil Nadu, India · GPA: 3.96/4.00

Moving to Vellore Institute of Technology marked the first major shift in my life, both academically and personally. It was my first time living away from family, learning independence while sharing hostel spaces with students from diverse backgrounds. At the same time, the program laid a strong foundation in core computer science, shaping how I approached problem-solving, structured thinking, and disciplined execution during my undergraduate years.

Stepping outside familiar ground

Joining VIT was my first experience of stepping outside a familiar environment. Living on campus meant adapting quickly to new routines, shared spaces, and a diverse peer group. This transition helped me become more self-reliant and adaptable, while learning how to navigate change without losing focus on long-term goals.

The rigor of an engineering curriculum

The engineering program introduced a demanding academic structure defined by continuous evaluation. Regular assignments, lab work, quizzes, and examinations created a steady pace that rewarded consistency over short-term effort. This environment strengthened my discipline, time management, and ability to sustain focus across long academic cycles.

Laying the foundations of computer science

Core computer science subjects formed the technical backbone of my undergraduate experience. Through coursework and labs, I developed a strong grounding in algorithms and data structures, programming across multiple languages, database systems, and problem-solving through mathematics. Exposure to operating systems, computer networks, and basic security concepts helped me understand how software systems function end to end. Implementing solutions, debugging under constraints, and explaining my approach during evaluations helped translate theory into practical understanding.

Growth beyond the classroom

Life on campus contributed significantly to my growth outside academics. Living in hostels and working on team projects pushed me to communicate more clearly, build new friendships, and adjust to different personalities and working styles. Managing coursework alongside everyday responsibilities improved my time management and independence, while navigating shared spaces and long-term collaborations strengthened my emotional awareness and adaptability.

A gradual pull toward data-driven work

Over time, I began developing a stronger interest in working with data. Tasks that involved organizing information, analyzing patterns, and understanding outcomes felt engaging and intuitive. This growing curiosity led me to explore data-focused problems more deeply and reflect on where I wanted to take my career. Eventually, it became clear that pursuing higher education in analytics would allow me to build on my technical background while pivoting toward the data side of technology.

Work Experience

My work experience spans data science and analytics roles where the focus has always been practical impact. I’ve worked with real-world data to build pipelines, dashboards, and analyses that help teams understand what’s happening and decide what to do next.

Mar 2025 – Jun 2025

Finlatics

Data Science Intern · Remote

Analyzed a media and technology dataset based on YouTube channel performance to understand subscriber growth, content categories, earnings patterns, and geographic trends.

Role: Data Science Intern
Company: Analytics and finance education platform working with real-world, case-based datasets

About the company

Finlatics (operated by Fincrux Technologies LLP) is an analytics-focused education platform that runs structured, hands-on programs using real-world business datasets. The organization is recognized by DPIIT, a Government of India body that supports early-stage startups, and emphasizes practical, applied learning over theoretical coursework.

My role

I worked on end-to-end exploratory data analysis of a media and technology dataset centered on YouTube channels. Using Python libraries such as Pandas and NumPy, I cleaned raw data, handled missing and inconsistent values, and analyzed patterns across subscriber growth, content categories, estimated earnings, and geographic distribution. The analysis was structured around specific business questions rather than open-ended exploration.

Impact & deliverables

I delivered a detailed analytical report and presentation that connected YouTube performance metrics with broader business implications, highlighting how content type, scale, and geography influenced growth and monetization trends. The final deliverables focused on clarity of insights rather than complex modeling.

Key learnings

This experience strengthened my ability to work independently with real-world datasets, structure exploratory analysis logically, and communicate insights in a clear, non-technical manner.

Nov 2024 – Jan 2025

Finlatics

Business Analyst Intern · Remote

Worked on consulting-style business case studies focused on profitability improvement, feasibility analysis, and data-driven decision-making.

Role: Business Analyst Intern
Company: Analytics and finance education platform working with real-world, case-based datasets

About the company

Finlatics (operated by Fincrux Technologies LLP) runs structured business and analytics programs designed around real-world case studies. The focus is on structured thinking, feasibility analysis, and communicating insights clearly in a decision-oriented format.

My role

I worked on multiple consulting-style business case studies, each with a clearly defined objective and structured approach. For a turnaround strategy case involving a technology-focused company, I analyzed revenue streams, cost structures, and margin drivers using Excel, identified key loss-making segments, and evaluated strategic levers such as pricing, cost rationalization, and market focus. The analysis was structured using MECE principles to ensure clarity and logical flow.

In a separate tourism feasibility case, I assessed the viability of a proposed tourism project by estimating demand, projecting revenues, and analyzing fixed and variable costs. I used Excel to build simple financial models and scenario analyses, and supported the findings with basic visual summaries in Power BI to clearly communicate trade-offs and risks.
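A feasibility model of this kind can be sketched in a few lines. This is an illustrative example only: the function names, visitor counts, prices, and costs below are hypothetical placeholders, not figures from the actual case study, and the original work was done in Excel rather than Python.

```python
import math

def project_profit(visitors, ticket_price, variable_cost_per_visitor, fixed_costs):
    """Annual profit: revenue minus fixed and variable costs."""
    revenue = visitors * ticket_price
    total_cost = fixed_costs + visitors * variable_cost_per_visitor
    return revenue - total_cost

def break_even_visitors(ticket_price, variable_cost_per_visitor, fixed_costs):
    """Smallest visitor count that covers all costs, or None if margin <= 0."""
    margin = ticket_price - variable_cost_per_visitor
    if margin <= 0:
        return None
    return math.ceil(fixed_costs / margin)

# Hypothetical inputs (all values are made up for illustration).
TICKET_PRICE = 500
VARIABLE_COST = 200
FIXED_COSTS = 3_000_000
SCENARIOS = {"pessimistic": 8_000, "base": 12_000, "optimistic": 16_000}

profits = {
    name: project_profit(visitors, TICKET_PRICE, VARIABLE_COST, FIXED_COSTS)
    for name, visitors in SCENARIOS.items()
}
break_even = break_even_visitors(TICKET_PRICE, VARIABLE_COST, FIXED_COSTS)

for name, profit in profits.items():
    print(f"{name:>12}: profit = {profit:,}")
print(f"break-even at {break_even:,} visitors")
```

Keeping demand as the only variable across scenarios makes the trade-off easy to read: with these placeholder inputs the project loses money below 10,000 visitors and turns profitable above that.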

Impact & deliverables

I delivered structured case reports and presentations that clearly outlined problem statements, assumptions, analytical findings, and final recommendations. Each deliverable focused on decision-making clarity, highlighting feasibility, risks, and expected outcomes rather than exhaustive numerical detail.

Key learnings

This role strengthened my ability to think top-down, make assumptions explicit, and present analysis in a concise, decision-ready format aligned with how business stakeholders evaluate trade-offs.

Nov 2023 – Dec 2023

Invansys Technologies

Data Science Intern · New Delhi, India

Developed and integrated machine learning components into a client-facing SaaS platform, supporting data-driven product decisions and improved user engagement.

Role: Data Science Intern
Company: SaaS-focused software consulting firm

About the company

Invansys Technologies is a niche software consultancy specializing in outsourced development of web and mobile SaaS platforms. The firm works with clients across industries to build scalable, production-ready applications, with a focus on innovation, reliability, and solutions that perform effectively in dynamic market environments.

My role

I worked on developing and integrating lightweight machine learning components, including sentiment analysis, within a client-facing SaaS platform. My responsibilities included data preparation, basic model training and evaluation using Python, and collaborating with engineers to ensure these components fit smoothly into the existing product architecture.

Impact & deliverables

The integrated models supported improved analysis of user behavior and informed feature-level product decisions, contributing to an estimated 13% improvement in user engagement on the client’s platform.

Key learnings

This experience reinforced the importance of building data and ML solutions that are reliable, maintainable, and aligned with real product requirements rather than standalone experiments.

Aug 2023 – Oct 2023

Blinkit

Data Analyst Intern · Gurgaon, Delhi NCR, India

Designed a multi-level RSTO (Reverse Stock Transfer Order) operations dashboard that reduced reverse stock pendency by ~8%, enabling faster corrective action across stores and cities.

Role: Data Analyst Intern
Company: Quick-commerce platform (groceries delivered in under 10 minutes), part of Eternal (Zomato)

About the company

Blinkit is a quick-commerce platform where real-time operational visibility is critical to efficiency across stores, warehouses, and last-mile delivery.

My role

I worked closely with both the Central Operations and Store Operations teams across different phases of the internship. During my time with the Central Ops team, I focused on last-mile delivery analytics by navigating a Redshift data warehouse and writing complex SQL queries to extract store-, city-, and rider-level data. These queries powered dashboards and trackers requested by managers and business POCs, supporting day-to-day operational decisions across delivery performance and rider onboarding workflows.

In the Store Ops phase, I was assigned an end-to-end project to design a comprehensive dashboard for tracking Reverse Stock Transfer Orders (RSTO), representing goods sent back from stores to warehouses. This involved integrating data from multiple teams such as inventory and store operations, building robust SQL pipelines with scheduled 24-hour refresh cycles, and creating dashboards with city-level, store-level, and day-wise pendency views. These views helped assess store-level efficiency and identify locations contributing disproportionately to backlog.
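The store-level, day-wise pendency view described above can be sketched as a single grouped query. This is a minimal illustration, not the production pipeline: the table `rsto_orders`, its columns, the status values, and the sample rows are all hypothetical, and SQLite (through Python's standard library) stands in for the Redshift warehouse.

```python
import sqlite3

# Hypothetical schema: one row per reverse stock transfer order.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE rsto_orders (
        order_id   INTEGER PRIMARY KEY,
        city       TEXT,
        store_id   TEXT,
        created_on TEXT,   -- ISO date the order was raised
        status     TEXT    -- 'PENDING' or 'CLOSED' (assumed values)
    )
    """
)
sample_rows = [
    (1, "Delhi", "S1", "2023-09-01", "PENDING"),
    (2, "Delhi", "S1", "2023-09-01", "CLOSED"),
    (3, "Delhi", "S2", "2023-09-01", "CLOSED"),
    (4, "Gurgaon", "S3", "2023-09-01", "PENDING"),
    (5, "Gurgaon", "S3", "2023-09-02", "PENDING"),
    (6, "Gurgaon", "S3", "2023-09-02", "CLOSED"),
]
conn.executemany("INSERT INTO rsto_orders VALUES (?, ?, ?, ?, ?)", sample_rows)

# Pendency per store per day: share of orders still pending.
PENDENCY_SQL = """
    SELECT city,
           store_id,
           created_on,
           COUNT(*) AS total_orders,
           SUM(CASE WHEN status = 'PENDING' THEN 1 ELSE 0 END) AS pending_orders,
           ROUND(100.0 * SUM(CASE WHEN status = 'PENDING' THEN 1 ELSE 0 END)
                 / COUNT(*), 1) AS pendency_pct
    FROM rsto_orders
    GROUP BY city, store_id, created_on
    ORDER BY pendency_pct DESC, city, store_id, created_on
"""

def store_day_pendency(connection):
    """Return (city, store, day, total, pending, pendency %) rows, worst first."""
    return connection.execute(PENDENCY_SQL).fetchall()

for row in store_day_pendency(conn):
    print(row)
```

Dropping `store_id` or `created_on` from the `GROUP BY` gives the coarser city-level or all-days views from the same base table, which is one simple way to keep the three levels consistent with each other.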

To better understand the real-world context behind the data, I also conducted on-ground visits to dark stores and warehouses. This helped align analytical outputs with physical operations, ensuring that the insights reflected actual operational constraints and workflows rather than just numbers on a dashboard.

Impact & deliverables

The RSTO dashboard enabled early visibility into operational bottlenecks and supported targeted corrective actions, contributing to an estimated 8% reduction in reverse stock pendency rates. In parallel, the SQL-based trackers and dashboards built for Central Ops supported daily business decision-making by improving visibility into last-mile delivery performance, rider onboarding progression, and operational throughput, allowing teams to respond faster and more effectively.

Key learnings

I learned that analytics creates real impact only when insights align closely with how operations teams work and make decisions in real time.

2025 – Present

Applied Analytics Club (APAN), Columbia University

AVP, Business Development

As AVP of Business Development for the Columbia Applied Analytics Club (APAN), I primarily work on sponsorship outreach and partnerships, while also supporting the team in organizing and running club events.

My role focuses mainly on sponsorship outreach and partnership coordination, helping secure support that enables the club to host well-structured and useful events. Alongside this, I help organize initiatives such as Pizza Talks, where professors join students for informal discussions around coursework, research, and industry perspectives. I also assist the core team with coordination and logistics across other events to ensure smooth execution.

One of the key highlights was supporting APAN’s collaboration with Google to host the Applied Innovation Lab at the Google office. The event included an office tour followed by a three-hour innovation workshop, where selected teams developed and presented analytics proposals around Google Workspace and GenAI. The themes focused on user segmentation, feature prioritization, and evaluating the quality and value of GenAI-driven solutions.

2022 – 2024

Centre for Social Entrepreneurship Development (CSED), Vellore Institute of Technology

Marketing | Expansion | Events

I worked on the Marketing, Expansion, and Events team at CSED, supporting outreach, promotions, and on-ground execution to increase participation and ensure entrepreneurship programs ran smoothly.

I supported marketing and coordination efforts across entrepreneurship-focused events, helping with outreach, promotions, and logistics. This included assisting with smaller sessions such as guest talks, founder interactions, and workshops, with a focus on clear communication and smooth execution for attendees.

Our flagship event was Startup Street 7.0, held during Gravitas (a tech fest at VIT). I was involved in offline marketing, where we promoted the event at the expo and engaged directly with potential participants. I pitched the event to interested students, which helped secure around 30 participant registrations. In addition, I contributed to planning the event timeline and coordination to ensure the overall flow of the showcase ran smoothly on the day.

Latest Projects

1

CUStayWise — Manhattan Housing Livability Analytics

Built an end-to-end analytics pipeline integrating NYC 311 complaints with Manhattan building datasets, analyzing 90K+ complaints across 4K+ residential buildings in four ZIP codes near Columbia University. Developed a building-level livability score and an interactive Flask + Leaflet map that renders a heat map of complaints across Manhattan buildings.

Tech Stack: Python, SQL, PostgreSQL, Flask, Leaflet.js
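The scoring formula is not spelled out here, so the following is only one plausible sketch of a building-level livability score: complaints per residential unit, min-max scaled and inverted to a 0-100 range. The function name, the per-unit normalization, and the sample data are all assumptions made for illustration.

```python
from collections import Counter

def livability_scores(complaints, units_by_building):
    """Score each building 0-100; fewer complaints per unit means a higher score.

    complaints: iterable of building ids, one entry per 311 complaint.
    units_by_building: dict of building id -> number of residential units.
    """
    counts = Counter(complaints)
    # Complaints per unit, so large buildings are not penalised for size alone.
    rates = {
        building: counts.get(building, 0) / units
        for building, units in units_by_building.items()
        if units > 0
    }
    if not rates:
        return {}
    lowest, highest = min(rates.values()), max(rates.values())
    span = highest - lowest
    # Min-max scale and invert; if every rate is equal, all buildings get 100.
    return {
        building: 100.0 if span == 0 else round(100.0 * (1 - (rate - lowest) / span), 1)
        for building, rate in rates.items()
    }

# Hypothetical sample: building ids and unit counts are made up.
complaints = ["B1", "B1", "B1", "B2", "B3", "B3"]
units = {"B1": 10, "B2": 20, "B3": 10, "B4": 5}
scores = livability_scores(complaints, units)
print(scores)
```

A relative score like this only ranks buildings against each other within the chosen ZIP codes; weighting complaint types by severity would be a natural refinement.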

2

AI-Driven Dengue Drug Discovery (ChemBERT + SMILES)

Generated 2,000+ novel molecular structures using SMILES (Simplified Molecular Input Line Entry System) manipulation with RDKit (a cheminformatics library in Python) and trained a ChemBERT transformer. Used cosine similarity to flag and eliminate high-severity candidates, supporting a scalable, cost-effective in silico dengue drug discovery pipeline.

Tech Stack: Python, RDKit, Transformers (ChemBERT), Cosine Similarity
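The cosine-similarity screening step can be illustrated with a small standard-library sketch. It assumes each molecule has already been turned into an embedding vector by the model; the toy 3-dimensional vectors, the candidate ids, the `flag_high_risk` helper, and the 0.9 threshold are hypothetical and not taken from the project.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)

def flag_high_risk(candidates, risky_references, threshold=0.9):
    """Return ids of candidates too similar to any high-severity reference."""
    flagged = []
    for candidate_id, vector in candidates.items():
        best = max(
            (cosine_similarity(vector, ref) for ref in risky_references),
            default=0.0,
        )
        if best >= threshold:
            flagged.append(candidate_id)
    return flagged

# Toy embeddings; real model embeddings have hundreds of dimensions.
risky = [[1.0, 0.0, 0.0]]
candidates = {
    "cand_a": [0.9, 0.1, 0.0],
    "cand_b": [0.0, 1.0, 0.0],
    "cand_c": [2.0, 0.0, 0.0],
}
flagged = flag_high_risk(candidates, risky)
print(flagged)
```

Because cosine similarity ignores vector length, `cand_c` is flagged even though its magnitude differs from the reference; only direction in embedding space matters.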

3

LSTM-Based Prediction Model for Nifty 50 Stocks

Achieved 98.55% test accuracy predicting closing prices of Nifty 50 stocks (the Nifty 50 is a benchmark index of the Indian stock market) using an LSTM (Long Short-Term Memory) model. Engineered technical-indicator features (MACD, RSI, Bollinger Bands, Garman–Klass volatility, traded value) to capture market sentiment and volatility, and applied smoothing (monthly aggregation, moving averages) to improve accuracy.

Tech Stack: Python, TensorFlow/Keras, Feature Engineering, Time-Series Modeling
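Three of the indicators named above can be sketched with the standard library alone. This is a simplified illustration, not the project's feature pipeline: the function names and sample prices are hypothetical, the RSI uses plain averages rather than Wilder's smoothed version, and the Bollinger Bands use the population standard deviation.

```python
import statistics

def ema(values, span):
    """Exponential moving average with smoothing factor 2 / (span + 1)."""
    alpha = 2 / (span + 1)
    out = [values[0]]
    for value in values[1:]:
        out.append(alpha * value + (1 - alpha) * out[-1])
    return out

def macd_line(closes, fast=12, slow=26):
    """MACD line: fast EMA minus slow EMA, one value per closing price."""
    return [f - s for f, s in zip(ema(closes, fast), ema(closes, slow))]

def rsi(closes, period=14):
    """Relative Strength Index over the last `period` price changes."""
    if len(closes) <= period:
        raise ValueError("need more than `period` closing prices")
    changes = [b - a for a, b in zip(closes[:-1], closes[1:])][-period:]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        return 100.0  # no losses in the window
    return 100 - 100 / (1 + avg_gain / avg_loss)

def bollinger_bands(closes, window=20, width=2.0):
    """(lower, middle, upper) bands from the last `window` closes."""
    if len(closes) < window:
        raise ValueError("need at least `window` closing prices")
    recent = closes[-window:]
    middle = statistics.fmean(recent)
    deviation = statistics.pstdev(recent)
    return middle - width * deviation, middle, middle + width * deviation

# Hypothetical closing prices, short windows so the example stays readable.
closes = [10.0, 11.0, 12.0, 11.0, 12.0, 13.0]
print("RSI(5):", rsi(closes, period=5))
print("Bollinger(5):", bollinger_bands(closes, window=5))
print("MACD(2, 4), latest:", macd_line(closes, fast=2, slow=4)[-1])
```

In a sequence model these values would typically be computed per day and stacked alongside the raw prices as extra input columns.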

My Research

This section brings together a set of research projects and a patent developed alongside my academic work. The focus has been to apply machine learning and data science techniques to practical problems across different domains.

Dec 2025

AI-driven dengue drug discovery using ChemBERT SMILES modelling and side-effect severity prediction

Status: Patented

Developing effective drugs for dengue is slow and expensive, often failing late after significant time and resources have already been invested. This work presents an AI-driven framework that accelerates early-stage dengue drug discovery by generating and screening potential drug molecules entirely in silico. Starting from 14 known dengue-related drugs, the system generates over 2,000 candidate molecules using chemical structure representations (SMILES) interchanges and learns molecular patterns through a transformer-based model (ChemBERT). By comparing generated compounds with known drugs, the framework estimates potential side-effect severity and flags structurally high-risk molecules early, enabling more targeted screening while reducing reliance on premature lab testing.

Nov 2024

HAWK.AI: Revolutionizing Dining Experiences with Touchless Gesture Recognition Technology

Conference Paper Presented in: iCASIC (International Conference on Automation, Signal Processing, Instrumentation and Control) · Nov 2024

In busy environments such as restaurants or kiosks, shared touchscreens can slow service and raise hygiene and accessibility concerns. This work presents a touchless dining interaction system that uses computer vision–based hand gesture recognition to enable hands-free menu navigation. The system recognizes multiple finger-based gestures through image capture and preprocessing, allowing reliable gesture detection in real time. Beyond improving hygiene, the approach also supports more inclusive interactions for users with hearing or speech impairments, demonstrating how gesture-based interfaces can reduce friction in everyday service workflows.

Jun 2024

Developing a Big Data Science based model linked to Meteorological data for enhanced applicability of Transportation Analytics

International Journal of Professional Studies

In everyday transportation systems, factors like rain, fog, or snow often influence delays and service reliability, but these effects are rarely analyzed in a unified way. This work combines large-scale transportation data with meteorological signals to better understand how weather conditions impact public transit performance. Using real-world data from the city of Winnipeg, the study demonstrates how weather-aware features can improve transportation analytics and support more informed planning, operational decisions, and demand analysis.

Sep 2023

Enhancing efficacy of machine learning model selection process for big data science projects by introducing an adaptive method based on dynamic factors

International Journal of Research in Science and Technology

In real-world data science projects, the 'best' machine learning model often changes as data grows, shifts, or new constraints are introduced. This work examines how model selection can adapt over time instead of remaining a one-time decision. It proposes an adaptive approach that accounts for changing factors such as data distribution shifts, project constraints, and evolving objectives, helping teams select models that remain better aligned as conditions change. The study demonstrates how making these factors explicit can lead to more robust, explainable, and automated model selection in large-scale data science workflows.

Mar 2023

Developing an integrated smart model to enhance the efficacy of Stock Market Prediction by leveraging XGBoost and Long Short-Term Memory Networks

International Journal of Transformations in Business Management

In stock markets, prices change constantly based on trends, historical patterns, and real-time events, making short-term prediction especially challenging. This work explores a hybrid machine learning approach that combines XGBoost to capture structured indicators and LSTM networks to model time-based price movements. By integrating both models into a single pipeline, the study shows how combining statistical signals with sequential patterns can improve prediction stability and performance under noisy and volatile market conditions.

Oct 2022

An in-depth analysis of the Multimodal Representation Learning with respect to the applications and linked challenges in multiple sectors

International Journal of Research in Science and Technology

In real-world systems such as voice assistants or social media platforms, models often need to understand text, images, and audio at the same time. This work reviews deep multimodal learning techniques, explaining how these models learn to combine these different data types into a shared understanding. It also highlights practical challenges such as uneven data availability across modalities, difficulty in aligning noisy signals, and limitations that arise when deploying multimodal systems in real-world settings.

Skills & Certifications

Python

SQL

R

AWS

Power BI

Tableau

HTML

CSS

JavaScript

C

C++

Java


Let's Work Together

If you’d like to collaborate or discuss an opportunity, reach out.

Phone

+1 (646) 249-3950

Email

ag5235@columbia.edu

arnavgoenka2003@gmail.com

Address

New York, 10025, United States
