Gopal Kataria

Lo, I unveil the tale of Gopal's life, that those who snoop may find ease and not scroll endlessly like cursed souls.

Mon, Jun 23 2025 Hyderabad Edition

UNDERGRADUATE RESEARCHER BUILDS PRODUCTION SYSTEMS AND LEADS OPEN SOURCE AT IIITH

How Gopal Kataria Debugs Life, One Commit at a Time.

Gopal Kataria
Photo: Professional Portrait

Hi, I am Gopal Kataria, a second-year student at IIIT Hyderabad pursuing a dual degree in Electronics and Communication Engineering, with an integrated MS by Research. I grew up in Goa, wrote my first line of code at around age ten, and have not really stopped since. These days I spend most of my time thinking about problems that sit at the boundary of math, systems, and software. Whether it is a statistical model running on a low-power sensor or a backend that needs to survive a sudden traffic spike, I am drawn to situations where correctness actually matters and cutting corners is not an option.

My current research at the Signal Processing and Communications Research Centre deals with change point detection, specifically how to detect when the underlying distribution of a data stream has shifted without making assumptions about what those distributions look like. The goal is to build methods lightweight enough to run on edge devices with no neural network in sight. On the ML side, I have been exploring a different kind of problem: can you surgically remove what a language model knows about a specific topic? I have been training Sparse Autoencoders on GPT-2 to find the features responsible for specific knowledge and then suppressing them at inference time. Early results are promising.

Outside the lab, I have spent a lot of time building things that people actually use. The ticketing system I co-built for Felicity 2026 handled over 11,000 registrations and held up through a traffic surge that was roughly fifteen times what we planned for. That kind of experience, where production is live, something breaks, and you have to fix it right now, is something no course can really replicate. I also coordinate OSDG, IIIT Hyderabad's largest technical club, and help run events at the Entrepreneurship Cell. I like being in rooms where things are being built or organized, ideally both at once.

When I am not at a laptop, I am at the gym, or hunting for good food in the city, or listening to music that most people find strange. I am genuinely excited about internships, research collaborations, or just interesting conversations. If you have something to build or a problem worth solving, feel free to reach out.

Experience

Undergraduate Researcher at SPCRC, IIIT Hyderabad

The change point detection problem sounds deceptively simple: given a stream of data, figure out when something about it fundamentally changed. In practice, it is hard. Most classical methods assume you know something about the distributions involved, which is often not true in the real world. My work at SPCRC focuses on non-parametric approaches that make no such assumptions, making them robust to the messy, unpredictable data you encounter outside of textbooks.

The specific framework I am developing uses the Probability Integral Transform to reduce the detection problem to a uniformity test. If the distribution has not changed, transformed samples should look uniform. When they stop looking uniform, something has shifted. I have been extending this to multivariate data using copulas, which lets us capture statistical dependencies across multiple dimensions without needing to model each marginal distribution separately. The end goal is a method efficient enough to run continuously on hardware with tight compute budgets.
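To make the reduction concrete, here is a minimal Python sketch; it is an illustration, not the framework under development at SPCRC. It assumes a univariate stream, uses the empirical CDF of a reference window as the transform, and measures non-uniformity with a Kolmogorov-Smirnov distance. The function names and the asymptotic two-sample critical value used as an approximate threshold are assumptions of this sketch.

```python
import bisect
import math


def pit(reference, window):
    """Probability Integral Transform using the reference window's empirical CDF."""
    ref = sorted(reference)
    return [bisect.bisect_right(ref, x) / len(ref) for x in window]


def ks_uniform(u):
    """Kolmogorov-Smirnov distance between the samples u and Uniform(0, 1)."""
    u = sorted(u)
    m = len(u)
    return max(max((i + 1) / m - v, v - i / m) for i, v in enumerate(u))


def has_changed(reference, window, alpha=0.05):
    """Flag a change when the transformed window stops looking uniform."""
    n, m = len(reference), len(window)
    # Assumption: asymptotic two-sample KS critical value, used as an
    # approximate threshold because the reference CDF is itself estimated
    # from n samples.
    c = math.sqrt(-0.5 * math.log(alpha / 2))
    threshold = c * math.sqrt((n + m) / (n * m))
    return ks_uniform(pit(reference, window)) > threshold
```

Nothing here assumes a parametric family for either distribution, which is the point of the approach. A copula-based multivariate extension would need more than this univariate transform.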

Web Developer and Systems Admin at DFL, IIIT Hyderabad

The Division of Flexible Learning runs distance and blended programs for students across India, which means their platforms cannot afford to go down. I joined as a web developer and systems admin, responsible for deploying and maintaining the services that students depend on daily. This involved setting up Linux environments, containerizing services with Docker, and routing traffic through Nginx reverse proxies in a way that could be updated without taking anything offline.

The bigger project was building the student portal from scratch. Programs at DFL have multiple cohorts, each with their own courses, schedules, and fee structures, and students need to enroll and pay through the same interface. I designed the database schema, built the backend and frontend using SvelteKit and PostgreSQL, integrated a payments gateway, and iterated on the whole thing based on feedback from real users. It was the first time I had full ownership of a system that was genuinely depended upon, which is a different kind of pressure than a side project.

Projects

Felicity 2026: Ticketing Infrastructure

Felicity is IIIT Hyderabad's annual cultural festival, and in 2026 we decided to build our own ticketing and gate entry system rather than rely on a third-party platform. I co-built the backend in Go using the Fiber framework, with PostgreSQL handling registrations and a QR-based system managing physical entry on event day. Over 11,000 people registered, and more than 6,000 passed through the gates.

The most interesting part was not building it but keeping it alive. When registrations opened, traffic hit levels far beyond what we had load-tested for. The mailing pipeline buckled first. We moved from Gmail to Azure Communication Services, which helped but not enough. Eventually we ended up running a hybrid setup combining Azure with the college SMTP server to distribute the load. Getting from "the system is struggling" to "everything is fine" in a matter of hours, with real users waiting, was a crash course in production engineering that I would not trade for anything.
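One way to picture a hybrid mail setup is a dispatcher that spreads messages across providers by weight and falls back to the next provider when a send fails. The sketch below is hypothetical: the production backend was written in Go, and the class name, the 3:1 weighting, and the provider callables are all invented for illustration.

```python
import itertools


class HybridMailer:
    """Weighted round-robin over mail providers, with fallback on failure."""

    def __init__(self, providers):
        # providers: list of (name, send_fn, weight); send_fn raises on failure.
        self._names = [name for name, _, _ in providers]
        self._send = {name: fn for name, fn, _ in providers}
        schedule = [name for name, _, weight in providers for _ in range(weight)]
        self._cycle = itertools.cycle(schedule)

    def send(self, to, subject, body):
        """Try the scheduled provider first, then the others; return the one used."""
        first = next(self._cycle)
        order = [first] + [name for name in self._names if name != first]
        for name in order:
            try:
                self._send[name](to, subject, body)
                return name
            except Exception:
                continue  # this provider is struggling, try the next one
        raise RuntimeError("all mail providers failed")
```

With weights of 3 and 1, three out of every four messages go to the first provider, and a provider outage degrades throughput instead of dropping mail.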


Targeted Unlearning of Copyrighted Entities via Sparse Autoencoders

Large language models are known to memorize significant chunks of their training data, including copyrighted material. The standard approach to address this is fine-tuning the model away from unwanted knowledge, which is expensive and tends to degrade unrelated capabilities. This project explores a more surgical alternative using mechanistic interpretability.

The idea is to train a Sparse Autoencoder on the internal activations of GPT-2 Medium. The autoencoder decomposes the activity of polysemantic neurons into more interpretable features. Once you can identify which features correspond to a specific body of knowledge (in this case the Harry Potter universe), you can suppress those features at inference time without touching the model weights at all. Results show a 77% reduction in Harry Potter knowledge recall with only a 0.1% degradation in WikiText-2 perplexity, and the suppression is 4x more concentrated on the target domain than on adjacent fantasy concepts.
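The suppression step can be sketched with NumPy as follows. This is a generic illustration, not the project's code: it assumes a standard ReLU autoencoder with separate encoder and decoder weights, and every name (`W_enc`, `b_enc`, `W_dec`, `b_dec`, `target_ids`) is hypothetical. The reconstruction error is added back so that anything the autoencoder does not explain passes through unchanged.

```python
import numpy as np


def sae_encode(x, W_enc, b_enc):
    """Sparse feature activations for a batch of model activations x."""
    return np.maximum(x @ W_enc + b_enc, 0.0)


def sae_decode(f, W_dec, b_dec):
    """Map feature activations back to the model's activation space."""
    return f @ W_dec + b_dec


def suppress_features(x, W_enc, b_enc, W_dec, b_dec, target_ids):
    """Return x with the contribution of the chosen features removed."""
    ids = np.asarray(target_ids, dtype=int)
    f = sae_encode(x, W_enc, b_enc)
    error = x - sae_decode(f, W_dec, b_dec)  # part the SAE does not capture
    f_edit = f.copy()
    f_edit[:, ids] = 0.0                     # switch off the target features
    return sae_decode(f_edit, W_dec, b_dec) + error
```

In a real model this function would run inside a forward hook on the layer the autoencoder was trained on, which is why the model weights themselves never change.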


Stutter Detection via Prosodic Feature Extraction

Most stutter detection pipelines depend on Praat, an external speech analysis program, which makes them difficult to port and deploy. This project builds a fully portable prosodic feature extraction pipeline on the SEP-28k dataset using only signal processing primitives. I implemented vowel onset point (VOP) detection from scratch following Mary & Yegnanarayana (2008), along with syllable-level features: jitter, shimmer, cepstral peak prominence (CPP), F0 dynamics, amplitude and duration tilt, and pause features, all without touching Praat.
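As an example of what "signal processing primitives" means in practice, jitter and shimmer reduce to a few lines once pitch periods and their peak amplitudes have been extracted. The sketch below uses the common "local" definition (mean absolute difference between consecutive cycles, divided by the mean); whether the project uses this exact variant is an assumption, and the function name is hypothetical.

```python
def local_perturbation(values):
    """Mean absolute cycle-to-cycle difference, relative to the mean value.

    Pass pitch period durations to get local jitter, or per-cycle peak
    amplitudes to get local shimmer.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_diff / mean_value


periods = [0.010, 0.011, 0.010, 0.011]  # seconds, made-up values
jitter = local_perturbation(periods)    # 0.001 / 0.0105
```

A perfectly periodic voice gives zero for both measures; larger values indicate cycle-to-cycle instability in pitch or loudness.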

On top of the feature pipeline, I trained a CNN+LSTM classifier achieving 85.7% balanced accuracy (84.7% / 86.7% per class) and an AUC of 0.915, outperforming classical baselines by a meaningful margin. The end result is a self-contained, dependency-light pipeline that can be dropped into any Python environment.
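Balanced accuracy is the unweighted mean of per-class recall, which is why it is preferred over plain accuracy on imbalanced data. Assuming the two per-class figures above are recalls, the headline number follows directly:

```python
def balanced_accuracy(per_class_recall):
    """Unweighted mean of per-class recall, so each class counts equally."""
    return sum(per_class_recall) / len(per_class_recall)


print(round(balanced_accuracy([84.7, 86.7]), 1))  # 85.7
```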


InboxPilot: Agentic Email Assistant

Built as a course project for Internals of Application Servers, InboxPilot is a demo agentic AI application that automates email triage and response drafting. The agent is built on top of a multi-step reasoning loop: it reads incoming messages, decides whether to respond, delegate, archive, or escalate, and drafts context-aware replies using a language model backend.

The project was an exercise in thinking about agentic systems not just as prompt wrappers but as stateful pipelines that need to handle failures gracefully, avoid acting on ambiguous instructions, and maintain a coherent task queue across turns.


Zomabot: Food Delivery Customer Support Agent

An agentic customer support automation system for a food delivery context, built with LangGraph. Zomabot handles common support flows — order status queries, refund requests, delivery issue escalations — using a graph-based agent architecture where each node represents a distinct decision or action step.

The interesting design challenge was keeping the agent's behaviour predictable and auditable. LangGraph's explicit graph structure made it straightforward to trace exactly which path an agent took for any given conversation, which is important when you are automating decisions that have real financial consequences for customers.
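The auditability argument is easier to see in code. The sketch below deliberately uses plain Python rather than LangGraph's API: each node is a function that updates the state and names the next node, and the runner records the path taken. The node names, the keyword routing, and the refund limit are hypothetical.

```python
END = "END"


def classify(state):
    """Route the conversation based on simple keyword matching."""
    text = state["message"].lower()
    if "refund" in text:
        return "refund"
    if "where" in text or "status" in text:
        return "order_status"
    return "escalate"


def order_status(state):
    state["reply"] = "Your order is on the way."
    return END


def refund(state):
    # Hypothetical policy: small refunds are automatic, large ones need a human.
    if state.get("order_total", float("inf")) <= 500:
        state["reply"] = "Refund approved."
        return END
    return "escalate"


def escalate(state):
    state["reply"] = "Connecting you to a support agent."
    return END


GRAPH = {"classify": classify, "order_status": order_status,
         "refund": refund, "escalate": escalate}


def run(state, start="classify", max_steps=10):
    """Walk the graph and return the ordered list of nodes visited."""
    path, node = [], start
    while node != END:
        if len(path) >= max_steps:
            raise RuntimeError("agent exceeded max_steps")
        path.append(node)
        node = GRAPH[node](state)
    return path
```

For a refund request on a large order, `run` returns `["classify", "refund", "escalate"]`, which is exactly the kind of trace that can be logged and reviewed after the fact.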


Urban Gov Construction Dashboard

A mockup of the Government Construction Impact Notice Board (GCINB): a ward-level, publicly accessible digital platform that aggregates information on government construction and infrastructure activity. The idea is that residents should be able to see what is being built near them, when it will affect traffic or utilities, and who to contact — all in one place, without filing RTI requests.

This was a civic tech design exercise exploring how public information systems could be structured to actually serve the public rather than bureaucratic record-keeping. The interface is built with ward-level granularity and designed to be legible to non-technical users.


Linux Shell Implementation

Building a shell from scratch in C is one of those projects that forces you to actually understand what happens when you type a command. I implemented process creation using fork and exec, signal handling, pipelines, I/O redirection, and job control. The tricky parts were the edge cases: zombie processes left behind by children that exited before the parent waited on them, orphaned processes that needed to be cleaned up, and system calls that could be interrupted mid-execution and needed to be retried. The implementation passed the full course test suite and conforms to POSIX shell behavior.
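The core of a pipeline can be sketched compactly. The project itself is in C; the version below uses Python's `os` module, which wraps the same POSIX calls (`fork`, `execvp`, `pipe`, `dup2`, `waitpid`), and the function name `run_pipeline` is made up for this illustration. It assumes a Unix-like system.

```python
import os


def run_pipeline(commands):
    """Run argv lists as a pipeline; return (final stdout bytes, exit codes)."""
    out_r, out_w = os.pipe()  # captures the last stage's stdout
    prev_r, pids = None, []
    for i, argv in enumerate(commands):
        last = i == len(commands) - 1
        next_r, next_w = (None, out_w) if last else os.pipe()
        pid = os.fork()
        if pid == 0:  # child: wire up stdin/stdout, then exec
            try:
                if prev_r is not None:
                    os.dup2(prev_r, 0)
                os.dup2(next_w, 1)
                # Close every pipe end so readers downstream can see EOF.
                for fd in {prev_r, next_r, next_w, out_r, out_w} - {None}:
                    os.close(fd)
                os.execvp(argv[0], argv)
            finally:
                os._exit(127)  # only reached if exec failed
        pids.append(pid)
        # Parent: drop the ends it no longer needs.
        if prev_r is not None:
            os.close(prev_r)
        if not last:
            os.close(next_w)
        prev_r = next_r
    os.close(out_w)
    chunks = []
    while True:
        chunk = os.read(out_r, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(out_r)
    codes = []
    for pid in pids:  # reap every child so none is left as a zombie
        _, status = os.waitpid(pid, 0)
        codes.append(os.WEXITSTATUS(status) if os.WIFEXITED(status)
                     else -os.WTERMSIG(status))
    return b"".join(chunks), codes
```

Two details from the paragraph above show up here. Every child is reaped with `waitpid`, which is what prevents zombies. And Python retries system calls interrupted by signals on its own, whereas a C shell has to check for `EINTR` and retry by hand.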


RISC-V RV32I Processor in Verilog

I designed and implemented a RISC-V processor supporting the full RV32I base integer instruction set in Verilog. The processor runs a pipelined execution model with hazard detection and data forwarding to handle read-after-write dependencies correctly. Working through the timing requirements of a multi-stage pipeline gave me a much more grounded understanding of how the hardware actually executes the code I write every day. The design passed all functional correctness tests against reference programs.
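The forwarding and hazard logic can be modelled in a few lines of Python. This is a behavioural sketch of the textbook scheme, assuming a classic five-stage pipeline with EX/MEM and MEM/WB pipeline registers; it is not the Verilog from the project, and the dictionary field names are hypothetical.

```python
def forward_select(rs, ex_mem, mem_wb):
    """Pick the source for an ALU operand that reads register rs."""
    # The most recent result wins: check the EX/MEM stage first.
    if ex_mem["reg_write"] and ex_mem["rd"] != 0 and ex_mem["rd"] == rs:
        return "EX_MEM"
    if mem_wb["reg_write"] and mem_wb["rd"] != 0 and mem_wb["rd"] == rs:
        return "MEM_WB"
    return "REG_FILE"


def load_use_stall(id_ex, rs1, rs2):
    """A load followed immediately by a consumer needs a one-cycle stall."""
    return bool(id_ex["mem_read"] and id_ex["rd"] != 0
                and id_ex["rd"] in (rs1, rs2))
```

The `rd != 0` checks matter because `x0` is hardwired to zero in RISC-V, so a write to it must never be forwarded. Forwarding alone cannot cover a load-use dependency, since the loaded value does not exist until the memory stage completes, which is why the stall check is separate.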


Leadership and Roles

Coordinator, Open Source Developers Group (OSDG), IIIT Hyderabad

OSDG is IIIT Hyderabad's largest technical club, and coordinating it means thinking about what actually gets students interested in open source beyond the surface appeal of contributing to GitHub repos. I lead a team of 25, and a big part of the role has been building external relationships that make the club more credible and give students access to people and opportunities outside the campus bubble. We built partnerships with Jane Street, FOSS United, and Bhashini, each of which involved convincing an external organization that a student-run club was worth their time.

The headline event this year was HackIIIT 2026, IIIT-H's biggest intra-college hackathon, run in partnership with Jane Street. Before that, I organized Build2Break as part of IIIT-H's first ever tech fest, a hybrid hackathon and bug hunt where teams worked through the night competing on real codebases. Managing 40-plus teams across an overnight event, keeping finances in order, and making sure nothing catastrophic happened required a kind of logistics thinking that is very different from writing code.

Team Head, Events and Operations at E-Cell, IIIT Hyderabad

FAIL'25 was a speaker event built around an uncomfortable idea: that failure is worth talking about seriously. Bringing in five speakers from across the country, arranging their travel and accommodation, and running the event itself in front of over 400 people required a level of coordination that does not leave much room for things going wrong. I led the operations for this event end to end.

Megathon is Hyderabad's largest student-run hackathon, and I was one of three Operations Heads responsible for making the whole thing work. My team of ten handled logistics and financial planning for over a thousand participants competing across three rounds for a prize pool of seven lakh rupees. On top of that, I managed the problem statement setting and judging process for the UG1 track, which meant staying across the technical side while simultaneously keeping the operational machinery running.