2021 Summer Internships
• Check back next year •

Internship Program Areas

Over the last few years, cheap computing, novel algorithms and mountains of data have unleashed new AI-based services. The Hive is at the forefront of this trend and has successfully built companies that tackle use cases involving large amounts of data. The Hive is currently working on a myriad of possible use cases that could make machine learning broadly accessible to high-value business users.

Ground Truth Gold – Enterprise Data Labeling Tool

The cognitive, or intelligent, enterprise is the hallmark of enterprise automation: both operational and strategic decision-making can be automated by using artificial intelligence (AI) to extract high-level inferences from real-time streams of raw data. The biggest boost to AI has come from the availability of data. A significant corpus of historical information in a specific domain enables an AI application to extract key concepts, recognize entities, identify associations and hierarchies, and generate what we call smart data by merging domain knowledge with ontologies.

A common bottleneck in deploying supervised learning systems is collecting human-annotated examples. In many domains, annotators form an opinion about the label of an example incrementally. As part of the internship, we’ll develop active learning techniques for semi-automatic data annotation and apply them to text data and machine data. We will test these techniques in two verticals.
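
To give a flavor of the idea, here is a minimal sketch of uncertainty sampling, one common active learning strategy. It is an illustration only, not the Ground Truth Gold or Euclid implementation: the function names, the 0.95 confidence threshold and the sample probabilities are all hypothetical. Examples the model is unsure about go to human annotators, while confident predictions are accepted as automatic labels.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_annotation(predictions, budget):
    """Return indices of the `budget` most uncertain examples (highest entropy)."""
    ranked = sorted(
        range(len(predictions)),
        key=lambda i: entropy(predictions[i]),
        reverse=True,
    )
    return ranked[:budget]


def auto_label(predictions, threshold=0.95):
    """Map index -> predicted class for examples at or above the threshold."""
    labels = {}
    for i, probs in enumerate(predictions):
        best = max(range(len(probs)), key=lambda c: probs[c])
        if probs[best] >= threshold:
            labels[i] = best
    return labels


# Hypothetical class probabilities from a model, one row per unlabeled example.
predictions = [
    [0.98, 0.02],  # confident: can be labeled automatically
    [0.55, 0.45],  # uncertain
    [0.80, 0.20],  # moderately confident
    [0.50, 0.50],  # maximally uncertain
]

to_label = select_for_annotation(predictions, budget=2)  # sent to annotators
accepted = auto_label(predictions)                       # labeled by the model
```

In a full loop, the newly annotated examples would be added to the training set, the model retrained, and the selection repeated until the annotation budget is spent.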

This tool is part of Euclid, The Hive’s internal framework for building ML/DL pipelines. Euclid is a reference pipeline to build, train, evaluate and deploy end-to-end Machine Learning and Deep Learning applications. The reference implementations support TensorFlow, PyTorch and regular Python-based libraries.


– Algorithm/Software Skills: Fundamental understanding of Deep Learning and Machine Learning. Experience with frameworks such as TensorFlow or PyTorch, and 3 years of experience in Python.
– Academic Background: Master’s or Ph.D. candidate in Computer Science, Engineering or Physics.


