publications

publications by category in reverse chronological order. generated by jekyll-scholar.

2024

arXiv

Increased LLM Vulnerabilities from Fine-tuning and Quantization

Divyanshu Kumar, Anurakt Kumar, Sahil Agarwal, and 1 more author

arXiv, Apr 2024
NeurIPS Workshop

Investigating Implicit Bias in Large Language Models: A Large-Scale Study of Over 50 LLMs

Divyanshu Kumar*, Umang Jain*, Sahil Agarwal, and 1 more author

In NeurIPS Safe Generative AI Workshop 2024, Oct 2024
NeurIPS Workshop

SAGE-RT: Synthetic Alignment data Generation for Safety Evaluation and Red Teaming

Anurakt Kumar*, Divyanshu Kumar*, Jatan Loya, and 4 more authors

In Red Teaming GenAI: What Can We Learn from Adversaries?, Oct 2024
NeurIPS Workshop

Efficacy of the SAGE-RT Dataset for Model Safety Alignment: A Comparative Study

Tanay Baswa, Nitin Aravind Birur, Divyanshu Kumar, and 3 more authors

In Pluralistic Alignment Workshop at NeurIPS 2024, Oct 2024
arXiv

VERA: Validation and Enhancement for Retrieval Augmented systems

Nitin Aravind Birur, Tanay Baswa, Divyanshu Kumar, and 3 more authors

Sep 2024