Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.

Pages

Posts

Interaction Styles

2 minute read

Published:

#question would you consider prompt engineering direct manipulation?

interaction style

  • input?
  • output presented?
  • internal objects?

    CLI:

    very linear? commands are imposed by the system

Gender Bias In Coreference Resolution

6 minute read

Published:

This post will explore what coreference is, how it can be gender-biased, and some models to reduce these biases, following a paper.

Winograd and ambiguous references

Winograd was a challenge created as a Turing test to determine how well an NLP (Natural Language Processing) agent works. The task is to assign a referent to an ambiguous pronoun.

The city councilmen refused the demonstrators a permit because they feared/advocated violence.
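As a rough illustration of the task only (not a model from the paper the post follows), one common baseline substitutes each candidate referent for the ambiguous pronoun and compares language-model scores; the model choice and the simplified sentence below are assumptions for the sketch.

```python
# Hedged sketch: score Winograd-style candidates by substituting the pronoun
# and comparing language-model loss. "gpt2" is just an illustrative choice.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def sentence_loss(text: str) -> float:
    """Average token-level cross-entropy of the sentence under the LM."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        return model(ids, labels=ids).loss.item()

template = "The city councilmen refused the demonstrators a permit because {} feared violence."
candidates = ["the city councilmen", "the demonstrators"]
best = min(candidates, key=lambda c: sentence_loss(template.format(c)))
print("Predicted referent:", best)
```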

projects

Dungeon Rooms generator

  • A reusable asset to generate dungeon rooms and minimal paths procedurally in Unity
  • Algorithms used: Delaunay triangulation, minimum spanning tree, and optimizations like grid correction (a rough sketch of this step follows below)
  • The app can generate rooms and interconnected corridors entirely from scratch, taking up to a few minutes per generation
GIF showing the process
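A minimal sketch of the corridor-selection idea, assuming room centres as 2D points: triangulate them with Delaunay, then keep only a minimum spanning tree of the resulting edges. The actual asset is a Unity/C# implementation; this SciPy version is purely illustrative, with made-up room positions.

```python
# Hedged sketch: pick corridors by triangulating room centres (Delaunay)
# and keeping only the minimum spanning tree of the resulting graph.
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial import Delaunay

rooms = np.array([[0, 0], [5, 1], [2, 6], [8, 4], [6, 8]], dtype=float)

# Build an adjacency matrix from the Delaunay edges, weighted by distance.
n = len(rooms)
weights = np.zeros((n, n))
for simplex in Delaunay(rooms).simplices:
    for i in range(3):
        a, b = simplex[i], simplex[(i + 1) % 3]
        weights[a, b] = weights[b, a] = np.linalg.norm(rooms[a] - rooms[b])

mst = minimum_spanning_tree(csr_matrix(weights)).toarray()
corridors = [(a, b) for a in range(n) for b in range(n) if mst[a, b] > 0]
print("Corridors between rooms:", corridors)
```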

RPG Level Demo

  • Illustrated simple level design and fighting combos using a state machine (a toy sketch of the combo logic follows below).
  • Used Unity to make the game demo, complete with sound design and dynamic lighting.
  • Played by more than 70 people; playable in the browser
GIF of multi attack combo
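A toy rendition of combo logic as a finite state machine, with made-up state and input names; the demo itself is written in Unity/C#, so this Python transition table is only a sketch of the idea.

```python
# Hedged sketch: attack-combo logic as a finite state machine.
# State and input names are illustrative, not from the actual demo.
TRANSITIONS = {
    ("idle", "attack"): "slash_1",
    ("slash_1", "attack"): "slash_2",
    ("slash_2", "attack"): "finisher",
    ("finisher", "attack"): "slash_1",
}

def step(state: str, button: str) -> str:
    """Advance the combo; any unknown (state, input) pair drops back to idle."""
    return TRANSITIONS.get((state, button), "idle")

state = "idle"
for press in ["attack", "attack", "attack"]:
    state = step(state, press)
    print(press, "->", state)
```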

Wizards and Knights

  • Demo platformer game made in Unity with puzzles, fighting, and platforming challenges
  • Implemented improved shadows for sprites, along with buoyancy and other effects
  • Played by more than 60 people
Screencap from game

Tetris Language Compiler

  • Designed a programming language for making Tetris and its variants, running in the terminal
  • Implemented the compiler and grammar for the language

Quantum chess

  • Designed the state diagram for the quantum part of the game
  • Implemented functions such as entangle, split, and measure for the quantum chess pieces (flow chart); a toy sketch of the split move follows below
  • Open source Quantum Chess implementation, made with Qiskit, hichesslib, and Qt
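A toy sketch of what a split move can look like in Qiskit, under the assumption that one qubit encodes which of two squares a piece occupies: a Hadamard gate creates the superposition and a measurement collapses it. The circuit and the use of the Aer simulator are illustrative, not the project's actual implementation.

```python
# Hedged sketch: a toy "split" move. One qubit encodes which of two squares
# the piece occupies; a Hadamard puts it in superposition, measurement collapses it.
from qiskit import QuantumCircuit
from qiskit_aer import AerSimulator

split = QuantumCircuit(1, 1)
split.h(0)            # piece is now on square A and square B with equal amplitude
split.measure(0, 0)   # "measure" move: collapse to one concrete square

counts = AerSimulator().run(split, shots=100).result().get_counts()
print(counts)  # roughly 50/50 between the two squares
```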

Reddit Flair

  • Implemented an LSTM+Attention model using PyTorch and torchtext to classify Reddit posts into their appropriate flairs (a rough sketch of the model shape follows below)
  • The model achieves an F1 score of 0.55 on the validation set
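A rough sketch of the model shape, assuming a bidirectional LSTM with simple additive attention pooling; the vocabulary size, dimensions, and number of flair classes below are placeholders, not the project's exact configuration.

```python
# Hedged sketch: an LSTM encoder with attention pooling for flair classification.
import torch
import torch.nn as nn

class FlairClassifier(nn.Module):
    def __init__(self, vocab_size=20000, embed_dim=128, hidden_dim=256, num_classes=10):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden_dim, 1)     # one score per time step
        self.out = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, tokens):                        # tokens: (batch, seq_len)
        h, _ = self.lstm(self.embed(tokens))          # (batch, seq_len, 2*hidden)
        weights = torch.softmax(self.attn(h), dim=1)  # attention over time steps
        pooled = (weights * h).sum(dim=1)             # weighted sum of hidden states
        return self.out(pooled)                       # class logits

logits = FlairClassifier()(torch.randint(0, 20000, (4, 50)))
print(logits.shape)  # torch.Size([4, 10])
```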

PokeBattle

  • Built an agent based on a DQN (Deep Q-Network) to play and win games against a fixed algorithm.
  • Finished the Pokémon battle environment, implemented a more flexible algorithm, and added data for various moves and Pokémon scraped and cleaned from multiple sources.

Rainbow RL Implementation

  • Built a Deep Q-Network-based agent to play Atari games in the OpenAI Gym environment using PyTorch (a minimal sketch of the TD update follows below)
  • The model, trained for 400 episodes, wins all 21 points for Pong in testing
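A minimal sketch of the temporal-difference update a plain DQN agent is built around; the network shape, hyperparameters, and the simple (non-Rainbow) loss are illustrative assumptions rather than the project's settings.

```python
# Hedged sketch: one temporal-difference update step for a plain DQN.
import torch
import torch.nn as nn

n_actions, obs_dim, gamma = 6, 128, 0.99
q_net = nn.Sequential(nn.Linear(obs_dim, 256), nn.ReLU(), nn.Linear(256, n_actions))
target_net = nn.Sequential(nn.Linear(obs_dim, 256), nn.ReLU(), nn.Linear(256, n_actions))
target_net.load_state_dict(q_net.state_dict())
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-4)

def td_update(obs, actions, rewards, next_obs, dones):
    """One gradient step on the squared TD error."""
    q = q_net(obs).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = rewards + gamma * (1 - dones) * target_net(next_obs).max(dim=1).values
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Dummy batch just to show the call shape.
batch = 32
print(td_update(torch.randn(batch, obs_dim),
                torch.randint(0, n_actions, (batch,)),
                torch.randn(batch),
                torch.randn(batch, obs_dim),
                torch.zeros(batch)))
```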

Dots and Boxes

  • Two multiplayer games, Tic-Tac-Toe and Dots and Boxes, made in Unity using Firebase and REST Client for Unity
  • Each game connects the device to another device playing the same game

Particle VFX

  • Reusable and editable effects: fireflies, lightning, fire
  • Made using only Unity’s particle system and 2D lighting

MetroPolis

  • Game made in Unity as part of the TechTatva fest at MIT (Manipal), playable on WebGL
  • Game rules were regulated by the organizing committee
  • Select blocks and click to place them on the grid

publications

LRG at SemEval-2021 Task 4: Improving Reading Comprehension with Abstract Words using Augmentation, Linguistic Features and Voting

Published in SemEval, 2021

In this article, we present our methodologies for SemEval-2021 Task-4: Reading Comprehension of Abstract Meaning. Given a fill-in-the-blank-type question and a corresponding context, the task is to predict the most suitable word from a list of 5 options. There are three sub-tasks within this task: Imperceptibility (subtask-I), Non-Specificity (subtask-II), and Intersection (subtask-III). We use encoders of transformers-based models pre-trained on the masked language modelling (MLM) task to build our Fill-in-the-blank (FitB) models. Moreover, to model imperceptibility, we define certain linguistic features, and to model non-specificity, we leverage information from hypernyms and hyponyms provided by a lexical database. Specifically, for non-specificity, we try out augmentation techniques, and other statistical techniques. We also propose variants, namely Chunk Voting and Max Context, to take care of input length restrictions for BERT, etc. Additionally, we perform a thorough ablation study, and use Integrated Gradients to explain our predictions on a few samples. Our best submissions achieve accuracies of 75.31% and 77.84%, on the test sets for subtask-I and subtask-II, respectively. For subtask-III, we achieve accuracies of 65.64% and 62.27%.
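As a rough sketch of the fill-in-the-blank setup only (without the paper's linguistic features, augmentation, Chunk Voting, or Max Context variants), a masked language model can score each of the five options at the blank position; the checkpoint and the example question below are placeholders.

```python
# Hedged sketch: score each answer option at a [MASK] position with a
# masked language model. The model choice and example are illustrative only.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased").eval()

question = "The committee reached a decision after a long [MASK]."
options = ["debate", "banana", "silence", "process", "pause"]

inputs = tokenizer(question.replace("[MASK]", tokenizer.mask_token), return_tensors="pt")
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]

with torch.no_grad():
    logits = model(**inputs).logits[0, mask_pos]

# Rank options by the logit of their (first) wordpiece at the mask position.
scores = {opt: logits[tokenizer(opt, add_special_tokens=False).input_ids[0]].item()
          for opt in options}
print(max(scores, key=scores.get))
```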

NLRG at SemEval-2021 Task 5: Toxic Spans Detection Leveraging BERT-based Token Classification and Span Prediction Techniques

Published in SemEval, 2021

Toxicity detection of text has been a popular NLP task in the recent years. In SemEval-2021 Task-5 Toxic Spans Detection, the focus is on detecting toxic spans within passages. Most state-of-the-art span detection approaches employ various techniques, each of which can be broadly classified into Token Classification or Span Prediction approaches. In our paper, we explore simple versions of both of these approaches and their performance on the task. Specifically, we use BERT-based models – BERT, RoBERTa, and SpanBERT for both approaches. We also combine these approaches and modify them to bring improvements for Toxic Spans prediction. To this end, we investigate results on four hybrid approaches – Multi-Span, Span+Token, LSTM-CRF, and a combination of predicted offsets using union/intersection. Additionally, we perform a thorough ablative analysis and analyze our observed results. Our best submission – a combination of SpanBERT Span Predictor and RoBERTa Token Classifier predictions – achieves an F1 score of 0.6753 on the test set. Our best post-eval F1 score is 0.6895 on intersection of predicted offsets from top-3 RoBERTa Token Classification checkpoints. These approaches improve the performance by 3% on average than those of the shared baseline models – RNNSL and SpaCy NER.
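The Token Classification side of the approach can be sketched roughly as follows: tag each token as toxic or not, then map predicted tokens back to character offsets. The checkpoint below is not fine-tuned and the label convention is an assumption; in practice a trained toxic-span model would replace it.

```python
# Hedged sketch: binary token classification for toxic spans, mapping predicted
# tokens back to character offsets. Model name and example are placeholders.
import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModelForTokenClassification.from_pretrained("bert-base-cased",
                                                        num_labels=2).eval()

text = "You are such an idiot, honestly."
enc = tokenizer(text, return_tensors="pt", return_offsets_mapping=True)
offsets = enc.pop("offset_mapping")[0]

with torch.no_grad():
    labels = model(**enc).logits.argmax(dim=-1)[0]   # 1 = toxic (assumed label id)

# Collect character offsets of tokens predicted toxic (special tokens span (0, 0)).
toxic_chars = {i for (s, e), lab in zip(offsets, labels)
               if lab == 1 and e > s
               for i in range(int(s), int(e))}
print(sorted(toxic_chars))
```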

talks

teaching

Teaching experience 1

Undergraduate course, University 1, Department, 2014

This is a description of a teaching experience. You can use markdown like any other post.

Teaching experience 2

Workshop, University 1, Department, 2015

This is a description of a teaching experience. You can use markdown like any other post.