Sandeep Panem

Data Scientist@Zopper

I currently work as a Data Scientist at Zopper.

I love data analysis. I am passionate about predictive analytics and interested in building statistical models to analyze real-world data. I work in areas related to data science, machine learning, text mining, data mining, social media analytics, and natural language processing.

Contact Information


Dec 2014 - Present

Data Scientist at Zopper

from Dec 2014 to Present

July 2014 - Nov 2014

Project Intern at Flipkart

from July 2014 to Nov 2014

2012 - 2014

Research Assistant @ SIEL Lab, IIIT Hyderabad.

from Dec 2012 to June 2014

Information Retrieval and Extraction Group

I am working with Prof. Vasudeva Varma as part of the Information Retrieval and Extraction Group at SIEL. I am currently working on the CLIA consortium project (a search engine for Indian languages), funded by the Department of Defence (Govt. of India). As part of the IIIT Hyderabad sub-team, I work on the query transliteration and translation module and maintain linguistic resources for the Telugu language. I mainly focus on entity linking, entity relation extraction, and sub-topic detection of an entity in social media.

2013, 2014

Teaching Assistant at IIIT Hyderabad

Cloud Computing Course - Monsoon, 2013

from Aug, 2013 to Dec, 2013

Information Retrieval and Extraction Course - Spring, 2014

from Jan, 2014 to May, 2014

Summer, 2013

Data Engineer at Imomentous Inc.

from Jun to Aug, 2013

Mapping Twitter Profiles to LinkedIn Profiles: tracking tweets based on a set of keywords and mapping the profiles of the respective users to their LinkedIn profiles.

Jan 2011 - May 2011

Project Intern at DRDO, RCI Hyderabad.

from Jan, 2011 to June, 2011


2012 - 2014

International Institute of Information Technology, Hyderabad - IIITH

from July 2012 to June 2014

MS by Research (Information Retrieval and Extraction)

2007 - 2011

Sreenidhi Institute of Science and Technology (SNIST) Hyderabad

from 2007 to 2011

Bachelor of Technology in Information Technology (BTECH- CSIT)


Structured Information Extraction from Natural Disaster Events on Twitter.

Sandeep Panem, Manish Gupta, Vasudeva Varma.

Web-KR workshop (CIKM) 2014, Shanghai, China.

Linking Entities in #Microposts.

Romil Bansal, Sandeep Panem, Priya Radhakrishnan, Manish Gupta, Vasudeva Varma.

WWW 2014, Seoul, Korea.

Entity Tracking in Real-Time using Sub-Topic Detection on Twitter

Sandeep Panem, Romil Bansal, Manish Gupta and Vasudeva Varma

ECIR 2014, Amsterdam, Netherlands.

EDIUM: Improving Entity Disambiguation via User Modeling

Romil Bansal, Sandeep Panem, Manish Gupta and Vasudeva Varma

ECIR 2014, Amsterdam, Netherlands.


SPoB: Score Prediction of a ball in Cricket

We propose a machine-learning-based approach that predicts the score of a ball given the Cricinfo commentary text, batsman, bowler, and context information of the ball. We also discuss the importance of these factors in predicting the score. Our approach consists of two phases. In the first phase, we extract the important features that help in predicting the score of a ball. In the second phase, we build models that predict the score of a ball given different information about it (commentary text, batsman, bowler, and context of the ball).

Mapping Twitter Profiles to LinkedIn Profiles

Tracking tweets based on a set of keywords and mapping the profiles of the respective users to their LinkedIn profiles.

Wiki Search

Building a mini search engine over 42 GB of Wikipedia data, with an expected response time of less than 1 second even for large queries. Response time and ranking are the key factors of the system.

Mining Reputation through Products extraction

An important task for the management of any company or organization is to track its reputation in the market. With the advent of online media and the social web, companies are more concerned about word of mouth spreading through media such as blogs and microblogs, of which Twitter is one of the most popular platforms. We present an automated approach that monitors a stream of tweets and clusters them into different topics with respect to the given entity, i.e., the company name. These clusters are ranked according to their priority. The company management can be alerted based on the severity of the topics that can affect the reputation of the company.


To develop a platform on top of OpenStack that installs the requested software, based on user choice, at the time of provisioning virtual machine(s). The installation process for a software stack is made simple by providing the user with predefined templates.

Virtualization Orchestration Layer

An orchestration suite provides a number of services and gives users access to them in the form of service endpoints. A cloud orchestration suite acts as a transparency layer: it combines all the resources and presents them to users as one (a cloud).

OpenShift Visual Workflows

A client for OpenShift that provides an interface to the Broker-API for simple deployments as well as more complicated workflows. The goal is to develop a portal for OpenShift with drag-and-drop features for selecting cartridges and application platforms according to business needs.

Truth Discovery of Multiple Conflicting Information Providers on Web

Truth discovery with multiple conflicting information providers aims to point the user to trustworthy sites that contain accurate information about various objects. We formulate the veracity problem of how to discover true facts from conflicting information, and we propose a framework to solve it by defining the trustworthiness of websites, the confidence of facts, and the influences between facts.

Data Mining on Chess Dataset, Comparison of Clustering Algorithms

Performing data cleaning, data preprocessing, clustering, classification, and association rule mining tasks on the dataset, and comparing the performance of various clustering algorithms on different types of datasets.

Sreenidhi Placement Management System

To develop a site for recruiting eligible candidates who have already enrolled with the Placement Office, and a fully interactive site for students to enhance their technical and communication skills and be placed in companies more easily.

Design and Development of Firmware for On-Board Power PC based Embedded System

The project aims to design and develop firmware for initializing the processor and handling resource allocation and interrupt sequences of the system. It is useful for deploying applications related to avionic systems, which offer high accuracy and precision.

C-Search Utility

A document maker, syntax checker and impact analyzer for code in C.

Mentored Projects

Cassandra + Elasticsearch - A scalable data warehouse system

On-demand Hadoop processing with Oozie

Create a Cassandra browser tool for browsing namespaces and data in Cassandra using Python

Twitter with Cassandra


Certificates and Awards

Web Component Developer in Java


Attended Big Data Analytics Conference

BDA 2013


Programming Languages
Java C C++ Python
Cloud Tools
OpenStack Xen, QEMU libvirt AWS OpenShift Cloudera
Big Data Tools
Hadoop Spark Hive Mahout Lucene
Database Tools
MySQL MongoDB Cassandra HBase
Machine Learning Tools
Weka libsvm


Cricket Music Movies Spirituality