Showing 1–50 of 173 results for author: Gupta, N

Searching in archive cs.
  1. arXiv:2408.10239  [pdf, ps, other]

    cs.CY cs.AI cs.LG cs.SE

    A Conceptual Framework for Ethical Evaluation of Machine Learning Systems

    Authors: Neha R. Gupta, Jessica Hullman, Hari Subramonyam

    Abstract: Research in Responsible AI has developed a range of principles and practices to ensure that machine learning systems are used in a manner that is ethical and aligned with human values. However, a critical yet often neglected aspect of ethical ML is the ethical implications that appear when designing evaluations of ML systems. For instance, teams may have to balance a trade-off between highly infor…

    Submitted 4 August, 2024; originally announced August 2024.

  2. arXiv:2408.02761  [pdf, other]

    cs.CV cs.LG

    Dimensionality Reduction and Nearest Neighbors for Improving Out-of-Distribution Detection in Medical Image Segmentation

    Authors: McKell Woodland, Nihil Patel, Austin Castelo, Mais Al Taie, Mohamed Eltaher, Joshua P. Yung, Tucker J. Netherton, Tiffany L. Calderone, Jessica I. Sanchez, Darrel W. Cleere, Ahmed Elsaiey, Nakul Gupta, David Victor, Laura Beretta, Ankit B. Patel, Kristy K. Brock

    Abstract: Clinically deployed deep learning-based segmentation models are known to fail on data outside of their training distributions. While clinicians review the segmentations, these models tend to perform well in most instances, which could exacerbate automation bias. Therefore, detecting out-of-distribution images at inference is critical to warn the clinicians that the model likely failed. This work a…

    Submitted 5 August, 2024; originally announced August 2024.

    Comments: Expansion of "Dimensionality Reduction for Improving Out-of-Distribution Detection in Medical Image Segmentation" arXiv:2308.03723. Submitted to the Journal for Machine Learning in Biomedical Imaging. Code available at https://github.com/mckellwoodland/dimen_reduce_mahal

  3. arXiv:2407.16805  [pdf, other]

    cs.HC cs.CY

    TAMIGO: Empowering Teaching Assistants using LLM-assisted viva and code assessment in an Advanced Computing Class

    Authors: Anishka IIITD, Diksha Sethi, Nipun Gupta, Shikhar Sharma, Srishti Jain, Ujjwal Singhal, Dhruv Kumar

    Abstract: Large Language Models (LLMs) have significantly transformed the educational landscape, offering new tools for students, instructors, and teaching assistants. This paper investigates the application of LLMs in assisting teaching assistants (TAs) with viva and code assessments in an advanced computing class on distributed systems in an Indian University. We develop TAMIGO, an LLM-based system for TA…

    Submitted 23 July, 2024; originally announced July 2024.

    Comments: Under review

  4. arXiv:2407.13597  [pdf, other]

    cs.CL cs.AI

    PLANTS: A Novel Problem and Dataset for Summarization of Planning-Like (PL) Tasks

    Authors: Vishal Pallagani, Biplav Srivastava, Nitin Gupta

    Abstract: Text summarization is a well-studied problem that deals with deriving insights from unstructured text consumed by humans, and it has found extensive business applications. However, many real-life tasks involve generating a series of actions to achieve specific goals, such as workflows, recipes, dialogs, and travel plans. We refer to them as planning-like (PL) tasks noting that the main commonality…

    Submitted 18 July, 2024; originally announced July 2024.

  5. arXiv:2407.05887  [pdf, other]

    cs.CL cs.AI cs.LG

    Generation and De-Identification of Indian Clinical Discharge Summaries using LLMs

    Authors: Sanjeet Singh, Shreya Gupta, Niralee Gupta, Naimish Sharma, Lokesh Srivastava, Vibhu Agarwal, Ashutosh Modi

    Abstract: The consequences of a healthcare data breach can be devastating for the patients, providers, and payers. The average financial impact of a data breach in recent months has been estimated to be close to USD 10 million. This is especially significant for healthcare organizations in India that are managing rapid digitization while still establishing data governance procedures that align with the lett…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: Accepted at BioNLP Workshop at ACL 2024; 21 pages (9 pages main content)

  6. arXiv:2406.14670  [pdf, other]

    cs.CL cs.AI cs.LG

    Exploring Design Choices for Building Language-Specific LLMs

    Authors: Atula Tejaswi, Nilesh Gupta, Eunsol Choi

    Abstract: Despite rapid progress in large language models (LLMs), their performance on a vast majority of languages remains unsatisfactory. In this paper, we study building language-specific LLMs by adapting monolingual and multilingual LLMs. We conduct systematic experiments on how design choices (base model selection, vocabulary extension, and continued fine-tuning) impact the adapted LLM, both in terms of…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: 15 pages, 6 figures, 11 tables

  7. arXiv:2405.19261  [pdf, other]

    cs.CL cs.AI cs.LG

    Faster Cascades via Speculative Decoding

    Authors: Harikrishna Narasimhan, Wittawat Jitkrittum, Ankit Singh Rawat, Seungyeon Kim, Neha Gupta, Aditya Krishna Menon, Sanjiv Kumar

    Abstract: Cascades and speculative decoding are two common approaches to improving language models' inference efficiency. Both approaches involve interleaving models of different sizes, but via fundamentally distinct mechanisms: cascades employ a deferral rule that invokes the larger model only for "hard" inputs, while speculative decoding uses speculative execution to primarily invoke the larger model in p…

    Submitted 29 May, 2024; originally announced May 2024.

  8. arXiv:2405.14432  [pdf, other]

    cs.LG

    Boosting Robustness by Clipping Gradients in Distributed Learning

    Authors: Youssef Allouah, Rachid Guerraoui, Nirupam Gupta, Ahmed Jellouli, Geovani Rizk, John Stephan

    Abstract: Robust distributed learning consists in achieving good learning performance despite the presence of misbehaving workers. State-of-the-art (SOTA) robust distributed gradient descent (Robust-DGD) methods, relying on robust aggregation, have been proven to be optimal: Their learning error matches the lower bound established under the standard heterogeneity model of $(G, B)$-gradient dissimilarity. Th…

    Submitted 27 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

  9. arXiv:2405.00491  [pdf, ps, other]

    cs.LG

    On the Relevance of Byzantine Robust Optimization Against Data Poisoning

    Authors: Sadegh Farhadkhani, Rachid Guerraoui, Nirupam Gupta, Rafael Pinot

    Abstract: The success of machine learning (ML) has been intimately linked with the availability of large amounts of data, typically collected from heterogeneous sources and processed on vast networks of computing devices (also called workers). Beyond accuracy, the use of ML in critical domains such as healthcare and autonomous driving calls for robustness against data poisoning and some faul…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: 38 pages

  10. arXiv:2404.16816  [pdf, other]

    cs.CL

    IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs on Indic Languages

    Authors: Harman Singh, Nitish Gupta, Shikhar Bharadwaj, Dinesh Tewari, Partha Talukdar

    Abstract: As large language models (LLMs) see increasing adoption across the globe, it is imperative for LLMs to be representative of the linguistic diversity of the world. India is a linguistically diverse country of 1.4 Billion people. To facilitate research on multilingual LLM evaluation, we release IndicGenBench - the largest benchmark for evaluating LLMs on user-facing generation tasks across a diverse…

    Submitted 7 August, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

    Comments: ACL 2024

  11. arXiv:2404.10136  [pdf, other]

    cs.CL cs.AI cs.LG

    Language Model Cascades: Token-level uncertainty and beyond

    Authors: Neha Gupta, Harikrishna Narasimhan, Wittawat Jitkrittum, Ankit Singh Rawat, Aditya Krishna Menon, Sanjiv Kumar

    Abstract: Recent advances in language models (LMs) have led to significant improvements in quality on complex NLP tasks, but at the expense of increased inference costs. Cascading offers a simple strategy to achieve more favorable cost-quality tradeoffs: here, a small model is invoked for most "easy" instances, while a few "hard" instances are deferred to the large model. While the principles underpinning c…

    Submitted 15 April, 2024; originally announced April 2024.

  12. arXiv:2404.05872  [pdf, other]

    cs.CV cs.LG cs.NE

    TabConv: Low-Computation CNN Inference via Table Lookups

    Authors: Neelesh Gupta, Narayanan Kannan, Pengmiao Zhang, Viktor Prasanna

    Abstract: Convolutional Neural Networks (CNNs) have demonstrated remarkable ability throughout the field of computer vision. However, CNN inference requires a large number of arithmetic operations, making them expensive to deploy in hardware. Current approaches alleviate this issue by developing hardware-supported, algorithmic processes to simplify spatial convolution functions. However, these methods still…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: 8 pages, Accepted at CF '24

    ACM Class: I.5.1

  13. arXiv:2404.00665  [pdf, ps, other]

    cs.IT

    On cumulative and relative cumulative past information generating function

    Authors: Santosh Kumar Chaudhary, Nitin Gupta, Achintya Roy

    Abstract: In this paper, we introduce the cumulative past information generating function (CPIG) and relative cumulative past information generating function (RCPIG). We study its properties. We establish its relation with generalized cumulative past entropy (GCPE). We defined CPIG stochastic order and its relation with dispersive order. We provide the results for the CPIG measure of the convoluted random v…

    Submitted 22 April, 2024; v1 submitted 31 March, 2024; originally announced April 2024.

  14. arXiv:2403.20327  [pdf, other]

    cs.CL cs.AI

    Gecko: Versatile Text Embeddings Distilled from Large Language Models

    Authors: Jinhyuk Lee, Zhuyun Dai, Xiaoqi Ren, Blair Chen, Daniel Cer, Jeremy R. Cole, Kai Hui, Michael Boratko, Rajvi Kapadia, Wen Ding, Yi Luan, Sai Meher Karthik Duddu, Gustavo Hernandez Abrego, Weiqiang Shi, Nithi Gupta, Aditya Kusupati, Prateek Jain, Siddhartha Reddy Jonnalagadda, Ming-Wei Chang, Iftekhar Naim

    Abstract: We present Gecko, a compact and versatile text embedding model. Gecko achieves strong retrieval performance by leveraging a key idea: distilling knowledge from large language models (LLMs) into a retriever. Our two-step distillation process begins with generating diverse, synthetic paired data using an LLM. Next, we further refine the data quality by retrieving a set of candidate passages for each…

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: 18 pages

  15. arXiv:2403.14235  [pdf, other]

    astro-ph.GA astro-ph.CO astro-ph.IM cs.CV cs.LG

    RG-CAT: Detection Pipeline and Catalogue of Radio Galaxies in the EMU Pilot Survey

    Authors: Nikhel Gupta, Ray P. Norris, Zeeshan Hayder, Minh Huynh, Lars Petersson, X. Rosalind Wang, Andrew M. Hopkins, Heinz Andernach, Yjan Gordon, Simone Riggi, Miranda Yew, Evan J. Crawford, Bärbel Koribalski, Miroslav D. Filipović, Anna D. Kapińska, Stanislav Shabala, Tessa Vernstrom, Joshua R. Marvil

    Abstract: We present source detection and catalogue construction pipelines to build the first catalogue of radio galaxies from the 270 $\rm deg^2$ pilot survey of the Evolutionary Map of the Universe (EMU-PS) conducted with the Australian Square Kilometre Array Pathfinder (ASKAP) telescope. The detection pipeline uses Gal-DINO computer-vision networks (Gupta et al., 2024) to predict the categories of radio…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: Accepted for publication in PASA. The paper has 22 pages, 12 figures and 5 tables

  16. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1110 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 8 August, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  17. PaCKD: Pattern-Clustered Knowledge Distillation for Compressing Memory Access Prediction Models

    Authors: Neelesh Gupta, Pengmiao Zhang, Rajgopal Kannan, Viktor Prasanna

    Abstract: Deep neural networks (DNNs) have proven to be effective models for accurate Memory Access Prediction (MAP), a critical task in mitigating memory latency through data prefetching. However, existing DNN-based MAP models suffer from challenges such as significant physical storage space and poor inference latency, primarily due to their large number of parameters. These limitations render them imp…

    Submitted 20 February, 2024; originally announced February 2024.

    Comments: 6 pages, 2 figures, HPEC '23

    Journal ref: 2023 IEEE High Performance Extreme Computing Conference (HPEC), 2023, pp. 1-7

  18. arXiv:2402.12780  [pdf, other]

    cs.LG

    Byzantine-Robust Federated Learning: Impact of Client Subsampling and Local Updates

    Authors: Youssef Allouah, Sadegh Farhadkhani, Rachid Guerraoui, Nirupam Gupta, Rafael Pinot, Geovani Rizk, Sasha Voitovych

    Abstract: The possibility of adversarial (a.k.a., Byzantine) clients makes federated learning (FL) prone to arbitrary manipulation. The natural approach to robustify FL against adversarial clients is to replace the simple averaging operation at the server in the standard $\mathsf{FedAvg}$ algorithm by a robust averaging rule. While a significant amount of work has been devoted to studying the c…

    Submitted 10 June, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

  19. arXiv:2402.07411  [pdf, other]

    cs.LG

    Potential-Based Reward Shaping For Intrinsic Motivation

    Authors: Grant C. Forbes, Nitish Gupta, Leonardo Villalobos-Arias, Colin M. Potts, Arnav Jhala, David L. Roberts

    Abstract: Recently there has been a proliferation of intrinsic motivation (IM) reward-shaping methods to learn in complex and sparse-reward environments. These methods can often inadvertently change the set of optimal policies in an environment, leading to suboptimal behavior. Previous work on mitigating the risks of reward shaping, particularly through potential-based reward shaping (PBRS), has not been ap…

    Submitted 12 February, 2024; originally announced February 2024.

    Comments: Extended version of paper appearing in AAMAS 2024

    ACM Class: I.2.6

  20. arXiv:2402.00045  [pdf, other]

    cs.MM cs.AI cs.LG

    Detecting Multimedia Generated by Large AI Models: A Survey

    Authors: Li Lin, Neeraj Gupta, Yue Zhang, Hainan Ren, Chun-Hao Liu, Feng Ding, Xin Wang, Xin Li, Luisa Verdoliva, Shu Hu

    Abstract: The rapid advancement of Large AI Models (LAIMs), particularly diffusion models and large language models, has marked a new era where AI-generated multimedia is increasingly integrated into various aspects of daily life. Although beneficial in numerous fields, this content presents significant risks, including potential misuse, societal disruptions, and ethical concerns. Consequently, detecting mu…

    Submitted 7 February, 2024; v1 submitted 22 January, 2024; originally announced February 2024.

  21. arXiv:2401.06362  [pdf, other]

    cs.NE cs.AR cs.LG cs.OS

    Attention, Distillation, and Tabularization: Towards Practical Neural Network-Based Prefetching

    Authors: Pengmiao Zhang, Neelesh Gupta, Rajgopal Kannan, Viktor K. Prasanna

    Abstract: Attention-based Neural Networks (NN) have demonstrated their effectiveness in accurate memory access prediction, an essential step in data prefetching. However, the substantial computational overheads associated with these models result in high inference latency, limiting their feasibility as practical prefetchers. To close the gap, we propose a new approach based on tabularization that significan…

    Submitted 21 February, 2024; v1 submitted 23 December, 2023; originally announced January 2024.

  22. arXiv:2401.02412  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV

    LLM Augmented LLMs: Expanding Capabilities through Composition

    Authors: Rachit Bansal, Bidisha Samanta, Siddharth Dalmia, Nitish Gupta, Shikhar Vashishth, Sriram Ganapathy, Abhishek Bapna, Prateek Jain, Partha Talukdar

    Abstract: Foundational models with billions of parameters which have been trained on large corpora of data have demonstrated non-trivial skills in a variety of domains. However, due to their monolithic structure, it is challenging and expensive to augment them or impart new skills. On the other hand, due to their adaptation abilities, several new instances of these models are being trained towards new domai…

    Submitted 4 January, 2024; originally announced January 2024.

    Comments: 17 pages, 2 figures, 8 tables

  23. arXiv:2312.07343  [pdf, ps, other]

    cs.HC cs.AI

    Can ChatGPT Play the Role of a Teaching Assistant in an Introductory Programming Course?

    Authors: Anishka, Atharva Mehta, Nipun Gupta, Aarav Balachandran, Dhruv Kumar, Pankaj Jalote

    Abstract: The emergence of Large language models (LLMs) is expected to have a major impact on education. This paper explores the potential of using ChatGPT, an LLM, as a virtual Teaching Assistant (TA) in an Introductory Programming Course. We evaluate ChatGPT's capabilities by comparing its performance with that of human TAs in some of the important TA functions. The TA functions which we focus on include…

    Submitted 22 January, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

    Comments: Under review

  24. arXiv:2312.06728  [pdf, other]

    cs.CV astro-ph.CO astro-ph.GA astro-ph.IM

    A Multimodal Dataset and Benchmark for Radio Galaxy and Infrared Host Detection

    Authors: Nikhel Gupta, Zeeshan Hayder, Ray P. Norris, Minh Huynh, Lars Petersson

    Abstract: We present a novel multimodal dataset developed by expert astronomers to automate the detection and localisation of multi-component extended radio galaxies and their corresponding infrared hosts. The dataset comprises 4,155 instances of galaxies in 2,800 images with both radio and infrared modalities. Each instance contains information on the extended radio galaxy class, its corresponding bounding…

    Submitted 11 December, 2023; originally announced December 2023.

    Comments: Accepted in NeurIPS 2023 conference ML4PS workshop (https://nips.cc/). The full version, accepted in PASA, is available at https://doi.org/10.1017/pasa.2023.64

  25. arXiv:2312.05456  [pdf, other]

    cs.LG physics.soc-ph q-bio.PE

    On the calibration of compartmental epidemiological models

    Authors: Nikunj Gupta, Anh Mai, Azza Abouzied, Dennis Shasha

    Abstract: Epidemiological compartmental models are useful for understanding infectious disease propagation and directing public health policy decisions. Calibration of these models is an important step in offering accurate forecasts of disease dynamics and the effectiveness of interventions. In this study, we present an overview of calibrating strategies that can be employed, including several optimization…

    Submitted 8 December, 2023; originally announced December 2023.

  26. arXiv:2312.00306  [pdf, other]

    astro-ph.IM astro-ph.CO astro-ph.GA cs.CV

    RadioGalaxyNET: Dataset and Novel Computer Vision Algorithms for the Detection of Extended Radio Galaxies and Infrared Hosts

    Authors: Nikhel Gupta, Zeeshan Hayder, Ray P. Norris, Minh Huynh, Lars Petersson

    Abstract: Creating radio galaxy catalogues from next-generation deep surveys requires automated identification of associated components of extended sources and their corresponding infrared hosts. In this paper, we introduce RadioGalaxyNET, a multimodal dataset, and a suite of novel computer vision algorithms designed to automate the detection and localization of multi-component extended radio galaxies and t…

    Submitted 30 November, 2023; originally announced December 2023.

    Comments: Accepted for publication in PASA. The paper has 17 pages, 6 figures, 5 tables

  27. arXiv:2310.10636  [pdf, other]

    cs.LG

    Dual-Encoders for Extreme Multi-Label Classification

    Authors: Nilesh Gupta, Devvrit Khatri, Ankit S Rawat, Srinadh Bhojanapalli, Prateek Jain, Inderjit Dhillon

    Abstract: Dual-encoder (DE) models are widely used in retrieval tasks, most commonly studied on open QA benchmarks that are often characterized by multi-class and limited training data. In contrast, their performance in multi-label and data-rich retrieval settings like extreme multi-label classification (XMC), remains under-explored. Current empirical evidence indicates that DE models fall significantly sho…

    Submitted 17 March, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

    Comments: 27 pages, 8 figures

    Journal ref: ICLR 2024 camera-ready publication

  28. arXiv:2310.08891  [pdf, other]

    cs.LG cs.IR

    EHI: End-to-end Learning of Hierarchical Index for Efficient Dense Retrieval

    Authors: Ramnath Kumar, Anshul Mittal, Nilesh Gupta, Aditya Kusupati, Inderjit Dhillon, Prateek Jain

    Abstract: Dense embedding-based retrieval is now the industry standard for semantic search and ranking problems, like obtaining relevant web documents for a given query. Such techniques use a two-stage process: (a) contrastive learning to train a dual encoder to embed both the query and documents and (b) approximate nearest neighbor search (ANNS) for finding similar documents for a given query. These two st…

    Submitted 13 October, 2023; originally announced October 2023.

  29. arXiv:2309.13591  [pdf, other]

    cs.LG cs.DC math.OC

    Robust Distributed Learning: Tight Error Bounds and Breakdown Point under Data Heterogeneity

    Authors: Youssef Allouah, Rachid Guerraoui, Nirupam Gupta, Rafaël Pinot, Geovani Rizk

    Abstract: The theory underlying robust distributed learning algorithms, designed to resist adversarial machines, matches empirical observations when data is homogeneous. Under data heterogeneity however, which is the norm in practical scenarios, established lower bounds on the learning error are essentially vacuous and greatly mismatch empirical observations. This is because the heterogeneity model consider…

    Submitted 28 October, 2023; v1 submitted 24 September, 2023; originally announced September 2023.

    Comments: Accepted to NeurIPS 2023

  30. arXiv:2309.07163  [pdf]

    eess.SP cs.LG

    Systematic Review of Experimental Paradigms and Deep Neural Networks for Electroencephalography-Based Cognitive Workload Detection

    Authors: Vishnu KN, Cota Navin Gupta

    Abstract: This article summarizes a systematic review of electroencephalography (EEG)-based cognitive workload (CWL) estimation. The focus of the article is twofold: identify the disparate experimental paradigms used for reliably eliciting discrete and quantifiable levels of cognitive load and the specific nature and representational structure of the commonly used input formulations in deep neural netwo…

    Submitted 11 September, 2023; originally announced September 2023.

    Comments: 10 Pages, 4 figures

    MSC Class: NA ACM Class: J.3; A.1; I.2.6

  31. arXiv:2309.05270  [pdf, other]

    cs.CL cs.LG

    CONFLATOR: Incorporating Switching Point based Rotatory Positional Encodings for Code-Mixed Language Modeling

    Authors: Mohsin Ali, Kandukuri Sai Teja, Neeharika Gupta, Parth Patwa, Anubhab Chatterjee, Vinija Jain, Aman Chadha, Amitava Das

    Abstract: The mixing of two or more languages is called Code-Mixing (CM). CM is a social norm in multilingual societies. Neural Language Models (NLMs) like transformers have been effective on many NLP tasks. However, NLM for CM is an under-explored area. Though transformers are capable and powerful, they cannot always encode positional information since they are non-recurrent. Therefore, to enrich word info…

    Submitted 18 October, 2023; v1 submitted 11 September, 2023; originally announced September 2023.

    Comments: Workshop on Computational Approaches to Linguistic Code-Switching @EMNLP2023

  32. arXiv:2308.05166  [pdf, other]

    astro-ph.IM astro-ph.CO astro-ph.GA cs.CV cs.LG

    Deep Learning for Morphological Identification of Extended Radio Galaxies using Weak Labels

    Authors: Nikhel Gupta, Zeeshan Hayder, Ray P. Norris, Minh Huynh, Lars Petersson, X. Rosalind Wang, Heinz Andernach, Bärbel S. Koribalski, Miranda Yew, Evan J. Crawford

    Abstract: The present work discusses the use of a weakly-supervised deep learning algorithm that reduces the cost of labelling pixel-level masks for complex radio galaxies with multiple components. The algorithm is trained on weak class-level labels of radio galaxies to get class activation maps (CAMs). The CAMs are further refined using an inter-pixel relations network (IRNet) to get instance segmentation…

    Submitted 9 August, 2023; originally announced August 2023.

    Comments: 14 pages, 6 figures, accepted for publication in PASA

  33. arXiv:2307.03966  [pdf, other]

    cs.AI cs.SE

    Multi-Intent Detection in User Provided Annotations for Programming by Examples Systems

    Authors: Nischal Ashok Kumar, Nitin Gupta, Shanmukha Guttula, Hima Patel

    Abstract: In mapping enterprise applications, data mapping remains a fundamental part of integration development, but it is time-consuming. An increasing number of applications lack naming standards, and nested field structures further add complexity for the integration developers. Once the mapping is done, data transformation is the next challenge for the users since each application expects data to be in a…

    Submitted 8 July, 2023; originally announced July 2023.

  34. arXiv:2307.02764  [pdf, other]

    cs.LG stat.ML

    When Does Confidence-Based Cascade Deferral Suffice?

    Authors: Wittawat Jitkrittum, Neha Gupta, Aditya Krishna Menon, Harikrishna Narasimhan, Ankit Singh Rawat, Sanjiv Kumar

    Abstract: Cascades are a classical strategy to enable inference cost to vary adaptively across samples, wherein a sequence of classifiers are invoked in turn. A deferral rule determines whether to invoke the next classifier in the sequence, or to terminate prediction. One simple deferral rule employs the confidence of the current classifier, e.g., based on the maximum predicted softmax probability. Despite…

    Submitted 23 January, 2024; v1 submitted 6 July, 2023; originally announced July 2023.

    Comments: NeurIPS 2023

  35. Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HPX, Go, Julia, Python, Rust, Swift, and Java

    Authors: Patrick Diehl, Steven R. Brandt, Max Morris, Nikunj Gupta, Hartmut Kaiser

    Abstract: Many scientific high performance codes that simulate e.g. black holes, coastal waves, climate and weather, etc. rely on block-structured meshes and use finite differencing methods to iteratively solve the appropriate systems of differential equations. In this paper we investigate implementations of an extremely simple simulation of this type using various programming systems and languages. We focu…

    Submitted 10 July, 2023; v1 submitted 18 May, 2023; originally announced July 2023.

  36. arXiv:2306.12100  [pdf, other]

    cs.CV cs.LG

    Efficient ResNets: Residual Network Design

    Authors: Aditya Thakur, Harish Chauhan, Nikunj Gupta

    Abstract: ResNets (or Residual Networks) are one of the most commonly used models for image classification tasks. In this project, we design and train a modified ResNet model for CIFAR-10 image classification. In particular, we aimed at maximizing the test accuracy on the CIFAR-10 benchmark while keeping the size of our ResNet model under the specified fixed budget of 5 million trainable parameters. Model s…

    Submitted 21 June, 2023; originally announced June 2023.

  37. arXiv:2306.12094  [pdf, other]

    cs.SI cs.LG

    Understanding human mobility patterns in Chicago: an analysis of taxi data using clustering techniques

    Authors: Harish Chauhan, Nikunj Gupta, Zoe Haskell-Craig

    Abstract: Understanding human mobility patterns is important in applications as diverse as urban planning, public health, and political organizing. One rich source of data on human mobility is taxi ride data. Using the city of Chicago as a case study, we examine data from taxi rides in 2016 with the goal of understanding how neighborhoods are interconnected. This analysis will provide a sense of which neigh…

    Submitted 21 June, 2023; originally announced June 2023.

  38. arXiv:2306.11128  [pdf, other]

    cs.LG cs.MA

    CAMMARL: Conformal Action Modeling in Multi Agent Reinforcement Learning

    Authors: Nikunj Gupta, Somjit Nath, Samira Ebrahimi Kahou

    Abstract: Before taking actions in an environment with more than one intelligent agent, an autonomous agent may benefit from reasoning about the other agents and utilizing a notion of a guarantee or confidence about the behavior of the system. In this article, we propose a novel multi-agent reinforcement learning (MARL) algorithm CAMMARL, which involves modeling the actions of other agents in different situ…

    Submitted 8 February, 2024; v1 submitted 19 June, 2023; originally announced June 2023.

  39. arXiv:2306.01689  [pdf]

    eess.IV cs.CV q-bio.NC

    Unique Brain Network Identification Number for Parkinson's Individuals Using Structural MRI

    Authors: Tanmayee Samantaray, Utsav Gupta, Jitender Saini, Cota Navin Gupta

    Abstract: We propose a novel algorithm called Unique Brain Network Identification Number (UBNIN) for encoding the brain networks of individual subjects. To realize this objective, we employed structural MRI on 180 Parkinson's disease (PD) patients and 70 healthy controls (HC) from the National Institute of Mental Health and Neurosciences, India. We parcellated each subject's brain volume and constructed an individ…

    Submitted 19 September, 2023; v1 submitted 2 June, 2023; originally announced June 2023.

    Comments: 15 pages, 5 figures, 1 algorithm, 1 main table, 1 appendix table

    Journal ref: Brain Sciences, vol. 13, no. 9, 08 Sep. 2023

  40. XTREME-UP: A User-Centric Scarce-Data Benchmark for Under-Represented Languages

    Authors: Sebastian Ruder, Jonathan H. Clark, Alexander Gutkin, Mihir Kale, Min Ma, Massimo Nicosia, Shruti Rijhwani, Parker Riley, Jean-Michel A. Sarr, Xinyi Wang, John Wieting, Nitish Gupta, Anna Katanova, Christo Kirov, Dana L. Dickinson, Brian Roark, Bidisha Samanta, Connie Tao, David I. Adelani, Vera Axelrod, Isaac Caswell, Colin Cherry, Dan Garrette, Reeve Ingle, Melvin Johnson, et al. (2 additional authors not shown)

    Abstract: Data scarcity is a crucial issue for the development of highly multilingual NLP systems. Yet for many under-represented languages (ULs) -- languages for which NLP research is particularly far behind in meeting user needs -- it is feasible to annotate small amounts of data. Motivated by this, we propose XTREME-UP, a benchmark defined by: its focus on the scarce-data scenario rather than zero-shot;…

    Submitted 24 May, 2023; v1 submitted 19 May, 2023; originally announced May 2023.

  41. arXiv:2305.01655  [pdf, other]

    cs.LG stat.ME

    Predicting blood pressure under circumstances of missing data: An analysis of missing data patterns and imputation methods using NHANES

    Authors: Harish Chauhan, Nikunj Gupta, Zoe Haskell-Craig

    Abstract: The World Health Organization defines cardio-vascular disease (CVD) as "a group of disorders of the heart and blood vessels," including coronary heart disease and stroke (WHO 21). CVD is affected by "intermediate risk factors" such as raised blood pressure, raised blood glucose, raised blood lipids, and obesity. These are predominantly influenced by lifestyle and behaviour, including physical inac…

    Submitted 1 May, 2023; originally announced May 2023.

  42. arXiv:2305.01471  [pdf, other]

    cs.DS

    FPT Approximations for Capacitated/Fair Clustering with Outliers

    Authors: Rajni Dabas, Neelima Gupta, Tanmay Inamdar

    Abstract: Clustering problems such as $k$-Median and $k$-Means are motivated by applications such as location planning and unsupervised learning, among others. In such applications, it is important to find a clustering of points that is not ``skewed'' in terms of the number of points, i.e., no cluster should contain too many points. This is modeled by capacity constraints on the sizes of clusters. In an o…

    Submitted 2 May, 2023; originally announced May 2023.

    Comments: Abstract shortened to meet arxiv requirements

  43. arXiv:2304.02925  [pdf]

    eess.IV cs.CV

    Computer-aided Diagnosis of Malaria through Transfer Learning using the ResNet50 Backbone

    Authors: Sanya Sinha, Nilay Gupta

    Abstract: According to the World Malaria Report of 2022, 247 million cases of malaria and 619,000 related deaths were reported in 2021. This highlights the predominance of the disease, especially in the tropical and sub-tropical regions of Africa, parts of South-east Asia, and Central and Southern America. Malaria is caused by the Plasmodium parasite, which is circulated through the bites of the female Anoph…

    Submitted 6 April, 2023; originally announced April 2023.

    ACM Class: I.4.9

  44. arXiv:2302.04787  [pdf, other]

    cs.LG cs.CR cs.DC

    On the Privacy-Robustness-Utility Trilemma in Distributed Learning

    Authors: Youssef Allouah, Rachid Guerraoui, Nirupam Gupta, Rafael Pinot, John Stephan

    Abstract: The ubiquity of distributed machine learning (ML) in sensitive public domain applications calls for algorithms that protect data privacy, while being robust to faults and adversarial behaviors. Although privacy and robustness have been extensively studied independently in distributed ML, their synthesis remains poorly understood. We present the first tight analysis of the error incurred by any alg…

    Submitted 29 May, 2023; v1 submitted 9 February, 2023; originally announced February 2023.

    Comments: Accepted paper at ICML

  45. arXiv:2302.01772  [pdf, other]

    cs.LG cs.DC

    Fixing by Mixing: A Recipe for Optimal Byzantine ML under Heterogeneity

    Authors: Youssef Allouah, Sadegh Farhadkhani, Rachid Guerraoui, Nirupam Gupta, Rafael Pinot, John Stephan

    Abstract: Byzantine machine learning (ML) aims to ensure the resilience of distributed learning algorithms to misbehaving (or Byzantine) machines. Although this problem received significant attention, prior works often assume the data held by the machines to be homogeneous, which is seldom true in practical settings. Data heterogeneity makes Byzantine ML considerably more challenging, since a Byzantine mach…

    Submitted 3 February, 2023; originally announced February 2023.

    Comments: Accepted paper at AISTATS 2023

  46. arXiv:2302.00545  [pdf, other]

    cs.CV q-bio.NC

    An Out-of-Domain Synapse Detection Challenge for Microwasp Brain Connectomes

    Authors: Jingpeng Wu, Yicong Li, Nishika Gupta, Kazunori Shinomiya, Pat Gunn, Alexey Polilov, Hanspeter Pfister, Dmitri Chklovskii, Donglai Wei

    Abstract: The size of image stacks in connectomics studies now reaches the terabyte and often petabyte scales with a great diversity of appearance across brain regions and samples. However, manual annotation of neural structures, e.g., synapses, is time-consuming, which leads to limited training data often smaller than 0.001\% of the test data in size. Domain adaptation and generalization approaches were pr…

    Submitted 1 February, 2023; originally announced February 2023.

  47. arXiv:2301.12802  [pdf, other]

    cs.LG

    Planning Multiple Epidemic Interventions with Reinforcement Learning

    Authors: Anh Mai, Nikunj Gupta, Azza Abouzied, Dennis Shasha

    Abstract: Combating an epidemic entails finding a plan that describes when and how to apply different interventions, such as mask-wearing mandates, vaccinations, school or workplace closures. An optimal plan will curb an epidemic with minimal loss of life, disease burden, and economic cost. Finding an optimal plan is an intractable computational problem in realistic settings. Policy-makers, however, would g…

    Submitted 7 June, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

  48. arXiv:2301.10336  [pdf, other]

    cs.CR

    A survey of Digital Manufacturing Hardware and Software Trojans

    Authors: Prithwish Basu Roy, Mudit Bhargava, Chia-Yun Chang, Ellen Hui, Nikhil Gupta, Ramesh Karri, Hammond Pearce

    Abstract: Digital Manufacturing (DM) refers to the on-going adoption of smarter, more agile manufacturing processes and cyber-physical systems. This includes modern techniques and technologies such as Additive Manufacturing (AM)/3D printing, as well as the Industrial Internet of Things (IIoT) and the broader trend toward Industry 4.0. However, this adoption is not without risks: with a growing complexity an…

    Submitted 24 January, 2023; originally announced January 2023.

    Comments: 15 pages

  49. arXiv:2301.02946  [pdf]

    cs.HC cs.SI physics.soc-ph

    Patterns of Social Vulnerability -- An Interactive Dashboard to Explore Risks to Public Health on the US County Level

    Authors: Darius Coelho, Nikita Gupta, Eric Papenhausen, Klaus Mueller

    Abstract: Social vulnerability is the susceptibility of a community to be adversely impacted by natural hazards and public health emergencies, such as drought, earthquakes, flooding, virus outbreaks, and the like. Climate change is at the root of many recent natural hazards while the COVID-19 pandemic is still an active threat. Social vulnerability also refers to resilience, or the ability to recover from s…

    Submitted 7 January, 2023; originally announced January 2023.

  50. arXiv:2301.00659  [pdf, ps, other]

    cs.IT math.CA math.ST

    On partial monotonicity of some extropy measures

    Authors: Nitin Gupta, Santosh Kumar Chaudhary

    Abstract: Gupta and Chaudhary [14] introduced general weighted extropy and studied related properties. In this paper, we study conditional extropy and define the monotonic behaviour of conditional extropy. Also, we obtain results on the convolution of general weighted extropy.

    Submitted 29 November, 2022; originally announced January 2023.

    MSC Class: 94A17; 62N05; 60E15