publications

publications by category in reverse chronological order. generated by jekyll-scholar.

2024

  1. arXiv
    Increased LLM Vulnerabilities from Fine-tuning and Quantization
    Divyanshu Kumar, Anurakt Kumar, Sahil Agarwal, and 1 more author
    arXiv, Apr 2024
  2. NeurIPS Workshop
    Investigating Implicit Bias in Large Language Models: A Large-Scale Study of Over 50 LLMs
    Divyanshu Kumar*, Umang Jain*, Sahil Agarwal, and 1 more author
    In NeurIPS Safe Generative AI Workshop 2024, Oct 2024
  3. NeurIPS Workshop
    SAGE-RT: Synthetic Alignment data Generation for Safety Evaluation and Red Teaming
    Anurakt Kumar*, Divyanshu Kumar*, Jatan Loya, and 4 more authors
    In Red Teaming GenAI: What Can We Learn from Adversaries?, Oct 2024
  4. NeurIPS Workshop
    Efficacy of the SAGE-RT Dataset for Model Safety Alignment: A Comparative Study
    Tanay Baswa, Nitin Aravind Birur, Divyanshu Kumar, and 3 more authors
    In Pluralistic Alignment Workshop at NeurIPS 2024, Oct 2024
  5. arXiv
    VERA: Validation and Enhancement for Retrieval Augmented systems
    Nitin Aravind Birur, Tanay Baswa, Divyanshu Kumar, and 3 more authors
    arXiv, Sep 2024