Sitemap

A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.

Jupyter notebook markdown generator

Posts

Interaction Styles

2 minute read

Published: September 23, 2024

#question would you consider prompt engineering direct manipulation?

interaction style

input?
output presented?
internal objects?
cli:
very linear? commands are imposed by the systems

Gender Bias In Coreference Resolution

6 minute read

Published: March 31, 2020

This post explores what coreference is, how it can be gender-biased, and some models to reduce the biases. Following a paper

Winograd and ambiguous references

The Winograd Schema Challenge was created as a Turing-test-style challenge to determine how well an NLP (natural language processing) agent works. The task is to assign a referent to an ambiguous pronoun.

The city councilmen refused the demonstrators a permit because they feared/advocated violence.
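A schema like this pairs one sentence template with two alternative words, and the choice of word flips the pronoun's referent. The sketch below encodes the example above as data. The class and field names are illustrative only, and the answers follow the usual reading of this sentence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WinogradSchema:
    """One sentence template whose pronoun referent flips with a single word."""
    template: str      # sentence with a {word} slot
    pronoun: str       # the ambiguous pronoun
    candidates: tuple  # possible referents of the pronoun
    answers: dict      # special word -> correct referent

    def sentence(self, word):
        return self.template.format(word=word)

    def referent(self, word):
        return self.answers[word]


schema = WinogradSchema(
    template=("The city councilmen refused the demonstrators a permit "
              "because they {word} violence."),
    pronoun="they",
    candidates=("the city councilmen", "the demonstrators"),
    answers={
        "feared": "the city councilmen",
        "advocated": "the demonstrators",
    },
)

for word in schema.answers:
    print(schema.sentence(word), "->", schema.referent(word))
```

A coreference model is judged on whether it recovers the right entry of `candidates` for each variant of the sentence.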

projects

Dungeon Rooms generator

A reusable asset to generate dungeon rooms and minimal paths procedurally in Unity
Algorithms used: Delaunay triangulation, minimum spanning tree, and optimizations such as grid correction
The app generates rooms and interconnected corridors entirely from scratch, taking up to a few minutes per generation
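The corridor step can be sketched as a minimum spanning tree over room centers. This is a minimal Python illustration, not the Unity asset's code: it runs Prim's algorithm on the complete graph instead of on Delaunay edges, and the room coordinates are made up. Because the Euclidean minimum spanning tree is a subgraph of the Delaunay triangulation, the triangulation mainly serves to cut down the candidate edges.

```python
import math


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete graph over `points`.

    Returns a list of (i, j) index pairs, one per corridor.
    """
    if not points:
        return []
    # best[j] = (distance from j to the tree, tree node that achieves it)
    best = {j: (math.dist(points[0], points[j]), 0)
            for j in range(1, len(points))}
    edges = []
    while best:
        # Attach the room that is currently closest to the tree.
        j = min(best, key=lambda k: best[k][0])
        _, i = best.pop(j)
        edges.append((i, j))
        # The new room may offer a shorter link to the remaining rooms.
        for k in best:
            d = math.dist(points[j], points[k])
            if d < best[k][0]:
                best[k] = (d, j)
    return edges


# Hypothetical room centers (not taken from the project).
rooms = [(0, 0), (4, 0), (4, 3), (10, 3)]
corridors = minimum_spanning_tree(rooms)
total = sum(math.dist(rooms[i], rooms[j]) for i, j in corridors)
print(corridors, total)
```

Each returned pair connects two rooms, so four rooms yield three corridors with no cycles.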

GIF showing the process

RPG Level Demo

Illustrated simple level design and fighting combos using a state machine.
Used Unity to make the game demo, complete with sound design and dynamic lighting.
Played by more than 70 people, playable in browser
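A combo state machine of this kind can be sketched in a few lines. The version below is a hypothetical Python illustration rather than the demo's Unity code: the state names, the three-hit chain, and the 0.5-second input window are all assumptions.

```python
from enum import Enum, auto


class ComboState(Enum):
    IDLE = auto()
    ATTACK_1 = auto()
    ATTACK_2 = auto()
    ATTACK_3 = auto()


# Pressing attack advances the chain; after the third hit it starts over.
NEXT_ON_ATTACK = {
    ComboState.IDLE: ComboState.ATTACK_1,
    ComboState.ATTACK_1: ComboState.ATTACK_2,
    ComboState.ATTACK_2: ComboState.ATTACK_3,
    ComboState.ATTACK_3: ComboState.ATTACK_1,
}


class ComboMachine:
    def __init__(self, window=0.5):
        self.window = window      # seconds allowed between presses
        self.state = ComboState.IDLE
        self.last_press = None

    def press_attack(self, now):
        # A press that arrives too late breaks the chain.
        if self.last_press is not None and now - self.last_press > self.window:
            self.state = ComboState.IDLE
        self.state = NEXT_ON_ATTACK[self.state]
        self.last_press = now
        return self.state


machine = ComboMachine(window=0.5)
states = [machine.press_attack(t) for t in (0.0, 0.3, 0.6, 2.0)]
print([s.name for s in states])
```

The first three presses arrive within the window and chain into the third attack; the last press comes too late, so the chain restarts at the first attack.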

GIF of multi attack combo

Wizards and Knights

Demo platformer game made in Unity with puzzles, fighting, and platforming challenges
Implemented improved shadows for sprites, along with buoyancy and other effects
Played by more than 60 people

Screencap from game

Tetris Language Compiler

Designed a programming language for building Tetris and its variants, running in the terminal
Implemented the compiler and grammar for the language

Quantum chess

Designed the state diagram for the quantum part of the game
Implemented functions like entangle, split and measure for the quantum chess pieces (flow chart)
Open source Quantum Chess implementation, made with Qiskit, hichesslib, and Qt

Reddit Flair

Implemented an LSTM+Attention model using PyTorch and torchtext to classify Reddit posts into their appropriate flairs
The model achieves an F1 score of 0.55 on the validation set

PokeBattle

Built a DQN-based agent to play and win games against a fixed algorithm.
Finished the Pokemon Battle environment, implemented a more flexible algorithm, and added data for various moves and Pokemon, scraped and cleaned from multiple sources.

Rainbow RL Implementation

Built a Deep Q-Network-based agent to play Atari games in the OpenAI Gym environment using PyTorch
The model, trained for 400 episodes, wins all 21 points in Pong at test time

Dots and Boxes

Two multiplayer games, Tic‐Tac‐Toe and Dots and Boxes, made in Unity using Firebase and Rest Client for Unity
The game connects the device to another device playing the same game

Particle Vfx

Reusable and editable effects: fireflies, lightning, fire
Made using only Unity’s particle system and 2D lighting

MetroPolis

Game made in Unity as part of the TechTatva fest at MIT (Manipal), playable on WebGL
Game rules were set by the organizing committee
Select blocks and click to place them on grid

publications

LRG at SemEval-2021 Task 4: Improving Reading Comprehension with Abstract Words using Augmentation, Linguistic Features and Voting

Published in SemEval, 2021

In this article, we present our methodologies for SemEval-2021 Task-4: Reading Comprehension of Abstract Meaning. Given a fill-in-the-blank-type question and a corresponding context, the task is to predict the most suitable word from a list of 5 options. There are three sub-tasks within this task: Imperceptibility (subtask-I), Non-Specificity (subtask-II), and Intersection (subtask-III). We use encoders of transformers-based models pre-trained on the masked language modelling (MLM) task to build our Fill-in-the-blank (FitB) models. Moreover, to model imperceptibility, we define certain linguistic features, and to model non-specificity, we leverage information from hypernyms and hyponyms provided by a lexical database. Specifically, for non-specificity, we try out augmentation techniques, and other statistical techniques. We also propose variants, namely Chunk Voting and Max Context, to take care of input length restrictions for BERT, etc. Additionally, we perform a thorough ablation study, and use Integrated Gradients to explain our predictions on a few samples. Our best submissions achieve accuracies of 75.31% and 77.84%, on the test sets for subtask-I and subtask-II, respectively. For subtask-III, we achieve accuracies of 65.64% and 62.27%.

NLRG at SemEval-2021 Task 5: Toxic Spans Detection Leveraging BERT-based Token Classification and Span Prediction Techniques

Published in SemEval, 2021

Toxicity detection of text has been a popular NLP task in recent years. In SemEval-2021 Task-5 Toxic Spans Detection, the focus is on detecting toxic spans within passages. Most state-of-the-art span detection approaches employ various techniques, each of which can be broadly classified into Token Classification or Span Prediction approaches. In our paper, we explore simple versions of both of these approaches and their performance on the task. Specifically, we use BERT-based models – BERT, RoBERTa, and SpanBERT – for both approaches. We also combine these approaches and modify them to bring improvements for Toxic Spans prediction. To this end, we investigate results on four hybrid approaches – Multi-Span, Span+Token, LSTM-CRF, and a combination of predicted offsets using union/intersection. Additionally, we perform a thorough ablative analysis and analyze our observed results. Our best submission – a combination of SpanBERT Span Predictor and RoBERTa Token Classifier predictions – achieves an F1 score of 0.6753 on the test set. Our best post-eval F1 score is 0.6895 on intersection of predicted offsets from top-3 RoBERTa Token Classification checkpoints. These approaches improve performance by 3% on average over the shared baseline models – RNNSL and SpaCy NER.
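Combining predicted offsets by union or intersection can be illustrated with plain sets of character offsets. This is a minimal sketch, not the paper's code: the function names are hypothetical, the offsets are made up, and the score is a simple set-overlap F1 over character offsets assumed here for illustration.

```python
def combine_offsets(predictions, mode="intersection"):
    """Merge several lists of toxic character offsets into one sorted list."""
    sets = [set(p) for p in predictions]
    if not sets:
        return []
    if mode == "union":
        combined = set().union(*sets)
    elif mode == "intersection":
        combined = set.intersection(*sets)
    else:
        raise ValueError("mode must be 'union' or 'intersection'")
    return sorted(combined)


def offset_f1(predicted, gold):
    """Set-overlap F1 between predicted and gold character offsets."""
    predicted, gold = set(predicted), set(gold)
    if not predicted and not gold:
        return 1.0
    overlap = len(predicted & gold)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


# Made-up offsets from two hypothetical models and a made-up gold span.
span_model = [4, 5, 6, 7, 8]
token_model = [6, 7, 8, 9]
gold = [5, 6, 7, 8]

union = combine_offsets([span_model, token_model], mode="union")
inter = combine_offsets([span_model, token_model], mode="intersection")
print(union, offset_f1(union, gold))
print(inter, offset_f1(inter, gold))
```

In this toy case the union recovers every gold offset at the cost of precision, while the intersection keeps only offsets both models agree on and trades recall for precision.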

Yash Bhartia