
Showing 1–42 of 42 results for author: Munawar, A

  1. arXiv:2410.05095  [pdf, other]

    cs.RO cs.GR cs.SE

    Towards a Modern and Lightweight Rendering Engine for Dynamic Robotic Simulations

    Authors: Christopher John Allison, Haoying Zhou, Adnan Munawar, Peter Kazanzides, Juan Antonio Barragan

    Abstract: Interactive dynamic simulators are an accelerator for developing novel robotic control algorithms and complex systems involving humans and robots. In user training and synthetic data generation applications, a high-fidelity visualization of the simulation is essential. Visual fidelity is dependent on the quality of the computer graphics algorithms used to render the simulated scene. Furthermore, t…

    Submitted 7 October, 2024; originally announced October 2024.

    Comments: 8 pages, 8 figures, submitted to the 2024 IEEE International Conference on Robotic Computing (IRC)

  2. arXiv:2409.03797  [pdf, other]

    cs.AI cs.CL

    NESTFUL: A Benchmark for Evaluating LLMs on Nested Sequences of API Calls

    Authors: Kinjal Basu, Ibrahim Abdelaziz, Kelsey Bradford, Maxwell Crouse, Kiran Kate, Sadhana Kumaravel, Saurabh Goyal, Asim Munawar, Yara Rizk, Xin Wang, Luis Lastras, Pavan Kapanipathi

    Abstract: Autonomous agent applications powered by large language models (LLMs) have recently risen to prominence as effective tools for addressing complex real-world tasks. At their core, agentic workflows rely on LLMs to plan and execute the use of tools and external Application Programming Interfaces (APIs) in sequence to arrive at the answer to a user's request. Various benchmarks and leaderboards have…

    Submitted 4 September, 2024; originally announced September 2024.

  3. arXiv:2407.00121  [pdf, other]

    cs.LG cs.AI cs.CL

    Granite-Function Calling Model: Introducing Function Calling Abilities via Multi-task Learning of Granular Tasks

    Authors: Ibrahim Abdelaziz, Kinjal Basu, Mayank Agarwal, Sadhana Kumaravel, Matthew Stallone, Rameswar Panda, Yara Rizk, GP Bhargav, Maxwell Crouse, Chulaka Gunasekara, Shajith Ikbal, Sachin Joshi, Hima Karanam, Vineet Kumar, Asim Munawar, Sumit Neelam, Dinesh Raghu, Udit Sharma, Adriana Meza Soria, Dheeraj Sreedhar, Praveen Venkateswaran, Merve Unuvar, David Cox, Salim Roukos, Luis Lastras, et al. (1 additional author not shown)

    Abstract: Large language models (LLMs) have recently shown tremendous promise in serving as the backbone to agentic systems, as demonstrated by their performance in multi-faceted, challenging benchmarks like SWE-Bench and Agent-Bench. However, to realize the true potential of LLMs as autonomous agents, they must learn to identify, call, and interact with external tools and application program interfaces (AP…

    Submitted 27 June, 2024; originally announced July 2024.

  4. arXiv:2406.13865  [pdf, other]

    cs.RO

    SurgicAI: A Fine-grained Platform for Data Collection and Benchmarking in Surgical Policy Learning

    Authors: Jin Wu, Haoying Zhou, Peter Kazanzides, Adnan Munawar, Anqi Liu

    Abstract: Despite advancements in robotic-assisted surgery, automating complex tasks like suturing remains challenging due to the need for adaptability and precision. Learning-based approaches, particularly reinforcement learning (RL) and imitation learning (IL), require realistic simulation environments for efficient data collection. However, current platforms often include only relatively simple, non-dexte…

    Submitted 19 June, 2024; originally announced June 2024.

  5. arXiv:2406.07375  [pdf, other]

    cs.RO cs.LG

    Improving the realism of robotic surgery simulation through injection of learning-based estimated errors

    Authors: Juan Antonio Barragan, Hisashi Ishida, Adnan Munawar, Peter Kazanzides

    Abstract: The development of algorithms for automation of subtasks during robotic surgery can be accelerated by the availability of realistic simulation environments. In this work, we focus on one aspect of the realism of a surgical simulator, which is the positional accuracy of the robot. In current simulators, robots have perfect or near-perfect accuracy, which is not representative of their physical coun…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: 6 page paper

    Journal ref: 2024 International Symposium on Medical Robotics (ISMR)

  6. arXiv:2406.07328  [pdf, other]

    cs.RO cs.LG

    Realistic Data Generation for 6D Pose Estimation of Surgical Instruments

    Authors: Juan Antonio Barragan, Jintan Zhang, Haoying Zhou, Adnan Munawar, Peter Kazanzides

    Abstract: Automation in surgical robotics has the potential to improve patient safety and surgical efficiency, but it is difficult to achieve due to the need for robust perception algorithms. In particular, 6D pose estimation of surgical instruments is critical to enable the automatic execution of surgical maneuvers based on visual feedback. In recent years, supervised deep learning algorithms have shown in…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: 6 pages

    Journal ref: 2024 IEEE International Conference on Robotics and Automation (ICRA)

  7. arXiv:2405.04324  [pdf, other]

    cs.AI cs.CL cs.SE

    Granite Code Models: A Family of Open Foundation Models for Code Intelligence

    Authors: Mayank Mishra, Matt Stallone, Gaoyuan Zhang, Yikang Shen, Aditya Prasad, Adriana Meza Soria, Michele Merler, Parameswaran Selvam, Saptha Surendran, Shivdeep Singh, Manish Sethi, Xuan-Hong Dang, Pengyuan Li, Kun-Lung Wu, Syed Zawad, Andrew Coleman, Matthew White, Mark Lewis, Raju Pavuluri, Yan Koyfman, Boris Lublinsky, Maximilien de Bayser, Ibrahim Abdelaziz, Kinjal Basu, Mayank Agarwal, et al. (21 additional authors not shown)

    Abstract: Large Language Models (LLMs) trained on code are revolutionizing the software development process. Increasingly, code LLMs are being integrated into software development environments to improve the productivity of human programmers, and LLM-based agents are beginning to show promise for handling complex tasks autonomously. Realizing the full potential of code LLMs requires a wide range of capabili…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: Corresponding Authors: Rameswar Panda, Ruchir Puri; Equal Contributors: Mayank Mishra, Matt Stallone, Gaoyuan Zhang

  8. arXiv:2403.00827  [pdf, other]

    cs.CL cs.AI cs.LG

    Self-Refinement of Language Models from External Proxy Metrics Feedback

    Authors: Keshav Ramji, Young-Suk Lee, Ramón Fernandez Astudillo, Md Arafat Sultan, Tahira Naseem, Asim Munawar, Radu Florian, Salim Roukos

    Abstract: It is often desirable for Large Language Models (LLMs) to capture multiple objectives when providing a response. In document-grounded response generation, for example, agent responses are expected to be relevant to a user's query while also being grounded in a given document. In this paper, we introduce Proxy Metric-based Self-Refinement (ProMiSe), which enables an LLM to refine its own initial re…

    Submitted 27 February, 2024; originally announced March 2024.

  9. arXiv:2402.15491  [pdf, other]

    cs.CL cs.AI

    API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs

    Authors: Kinjal Basu, Ibrahim Abdelaziz, Subhajit Chaudhury, Soham Dan, Maxwell Crouse, Asim Munawar, Sadhana Kumaravel, Vinod Muthusamy, Pavan Kapanipathi, Luis A. Lastras

    Abstract: There is a growing need for Large Language Models (LLMs) to effectively use tools and external Application Programming Interfaces (APIs) to plan and complete tasks. As such, there is tremendous interest in methods that can acquire sufficient quantities of train and test data that involve calls to tools / APIs. Two lines of research have emerged as the predominant strategies for addressing this cha…

    Submitted 20 May, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

    Comments: Accepted at ACL'24-main conference

  10. arXiv:2402.02479  [pdf, other]

    cs.LG cs.AI cs.CL cs.HC

    BRAIn: Bayesian Reward-conditioned Amortized Inference for natural language generation from feedback

    Authors: Gaurav Pandey, Yatin Nandwani, Tahira Naseem, Mayank Mishra, Guangxuan Xu, Dinesh Raghu, Sachindra Joshi, Asim Munawar, Ramón Fernandez Astudillo

    Abstract: Distribution matching methods for language model alignment such as Generation with Distributional Control (GDC) and Distributional Policy Gradient (DPG) have not received the same level of attention in reinforcement learning from human feedback (RLHF) as contrastive methods such as Sequence Likelihood Calibration (SLiC), Direct Preference Optimization (DPO) and its variants. We identify high varia…

    Submitted 10 June, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

    Comments: Accepted at ICML 2024 (main conference)

  11. arXiv:2401.11715  [pdf, other]

    cs.RO eess.SY

    Integrating 3D Slicer with a Dynamic Simulator for Situational Aware Robotic Interventions

    Authors: Manish Sahu, Hisashi Ishida, Laura Connolly, Hongyi Fan, Anton Deguet, Peter Kazanzides, Francis X. Creighton, Russell H. Taylor, Adnan Munawar

    Abstract: Image-guided robotic interventions represent a transformative frontier in surgery, blending advanced imaging and robotics for improved precision and outcomes. This paper addresses the critical need for integrating open-source platforms to enhance situational awareness in image-guided robotic research. We present an open-source toolset that seamlessly combines a physics-based constraint formulation…

    Submitted 22 January, 2024; originally announced January 2024.

    Comments: *These authors contributed equally

  12. arXiv:2401.11709  [pdf, other]

    cs.RO eess.SY

    Haptic-Assisted Collaborative Robot Framework for Improved Situational Awareness in Skull Base Surgery

    Authors: Hisashi Ishida, Manish Sahu, Adnan Munawar, Nimesh Nagururu, Deepa Galaiya, Peter Kazanzides, Francis X. Creighton, Russell H. Taylor

    Abstract: Skull base surgery is a demanding field in which surgeons operate in and around the skull while avoiding critical anatomical structures including nerves and vasculature. While image-guided surgical navigation is the prevailing standard, limitations still exist, requiring personalized planning and recognizing the irreplaceable role of a skilled surgeon. This paper presents a collaboratively controll…

    Submitted 22 January, 2024; originally announced January 2024.

    Comments: *These authors contributed equally

  13. arXiv:2310.13961  [pdf, other]

    cs.CL cs.AI

    Ensemble-Instruct: Generating Instruction-Tuning Data with a Heterogeneous Mixture of LMs

    Authors: Young-Suk Lee, Md Arafat Sultan, Yousef El-Kurdi, Tahira Naseem, Asim Munawar, Radu Florian, Salim Roukos, Ramón Fernandez Astudillo

    Abstract: Using in-context learning (ICL) for data generation, techniques such as Self-Instruct (Wang et al., 2023) or the follow-up Alpaca (Taori et al., 2023) can train strong conversational agents with only a small amount of human supervision. One limitation of these approaches is that they resort to very large language models (around 175B parameters) that are also proprietary and non-public. Here we exp…

    Submitted 21 October, 2023; originally announced October 2023.

    Journal ref: EMNLP 2023

  14. arXiv:2308.16688  [pdf]

    cs.CL cs.AI

    Using Large Language Models to Automate Category and Trend Analysis of Scientific Articles: An Application in Ophthalmology

    Authors: Hina Raja, Asim Munawar, Mohammad Delsoz, Mohammad Elahi, Yeganeh Madadi, Amr Hassan, Hashem Abu Serhan, Onur Inam, Luis Hermandez, Sang Tran, Wuqas Munir, Alaa Abd-Alrazaq, Hao Chen, Siamak Yousefi

    Abstract: Purpose: In this paper, we present an automated method for article classification, leveraging the power of Large Language Models (LLM). The primary focus is on the field of ophthalmology, but the model is extendable to other fields. Methods: We have developed a model based on Natural Language Processing (NLP) techniques, including advanced LLMs, to process and analyze the textual content of scient…

    Submitted 31 August, 2023; originally announced August 2023.

  15. arXiv:2307.02689  [pdf, other]

    cs.CL

    Learning Symbolic Rules over Abstract Meaning Representations for Textual Reinforcement Learning

    Authors: Subhajit Chaudhury, Sarathkrishna Swaminathan, Daiki Kimura, Prithviraj Sen, Keerthiram Murugesan, Rosario Uceda-Sosa, Michiaki Tatsubori, Achille Fokoue, Pavan Kapanipathi, Asim Munawar, Alexander Gray

    Abstract: Text-based reinforcement learning agents have predominantly been neural network-based models with embeddings-based representation, learning uninterpretable policies that often do not generalize well to unseen games. On the other hand, neuro-symbolic methods, specifically those that leverage an intermediate formal representation, are gaining significant attention in language understanding tasks. Th…

    Submitted 5 July, 2023; originally announced July 2023.

    Comments: ACL 2023

  16. arXiv:2303.05704  [pdf, other]

    cs.RO

    A Data-Driven Model with Hysteresis Compensation for I2RIS Robot

    Authors: Mojtaba Esfandiari, Yanlin Zhou, Shervin Dehghani, Muhammad Hadi, Adnan Munawar, Henry Phalen, Peter Gehlbach, Russell H. Taylor, Iulian Iordachita

    Abstract: Retinal microsurgery is a high-precision surgery performed on an exceedingly delicate tissue. It now requires extensively trained and highly skilled surgeons. Given the restricted range of instrument motion in the confined intraocular space, and also potentially restricting instrument contact with the sclera, snake-like robots may prove to be a promising technology to provide surgeons with greater…

    Submitted 10 March, 2023; originally announced March 2023.

  17. arXiv:2303.01733  [pdf, other]

    cs.HC cs.RO

    Improving Surgical Situational Awareness with Signed Distance Field: A Pilot Study in Virtual Reality

    Authors: Hisashi Ishida, Juan Antonio Barragan, Adnan Munawar, Zhaoshuo Li, Andy Ding, Peter Kazanzides, Danielle Trakimas, Francis X. Creighton, Russell H. Taylor

    Abstract: The introduction of image-guided surgical navigation (IGSN) has greatly benefited technically demanding surgical procedures by providing real-time support and guidance to the surgeon during surgery. To develop effective IGSN, a careful selection of the surgical information and the medium to present this information to the surgeon is needed. However, this is not a trivial task due to the broad…

    Submitted 1 August, 2023; v1 submitted 3 March, 2023; originally announced March 2023.

    Comments: First two authors contributed equally. 6 pages

    Journal ref: International Conference on Intelligent Robots and Systems (IROS) 2023

  18. arXiv:2302.13878  [pdf, other]

    cs.RO cs.HC

    Fully Immersive Virtual Reality for Skull-base Surgery: Surgical Training and Beyond

    Authors: Adnan Munawar, Zhaoshuo Li, Nimesh Nagururu, Danielle Trakimas, Peter Kazanzides, Russell H. Taylor, Francis X. Creighton

    Abstract: Purpose: A virtual reality (VR) system, where surgeons can practice procedures on virtual anatomies, is a scalable and cost-effective alternative to cadaveric training. The fully digitized virtual surgeries can also be used to assess the surgeon's skills using measurements that are otherwise hard to collect in reality. Thus, we present the Fully Immersive Virtual Reality System (FIVRS) for skull-b…

    Submitted 31 May, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

    Comments: IPCAI/IJCARS 2023

  19. arXiv:2211.11863  [pdf, other]

    cs.HC cs.CV cs.RO

    Twin-S: A Digital Twin for Skull-base Surgery

    Authors: Hongchao Shu, Ruixing Liang, Zhaoshuo Li, Anna Goodridge, Xiangyu Zhang, Hao Ding, Nimesh Nagururu, Manish Sahu, Francis X. Creighton, Russell H. Taylor, Adnan Munawar, Mathias Unberath

    Abstract: Purpose: Digital twins are virtual interactive models of the real world, exhibiting identical behavior and properties. In surgical applications, computational analysis from digital twins can be used, for example, to enhance situational awareness. Methods: We present a digital twin framework for skull-base surgeries, named Twin-S, which can be integrated within various image-guided interventions se…

    Submitted 6 May, 2023; v1 submitted 21 November, 2022; originally announced November 2022.

  20. arXiv:2204.11116  [pdf, other]

    cs.RO

    Human-Robot Shared Control for Surgical Robot Based on Context-Aware Sim-to-Real Adaptation

    Authors: Dandan Zhang, Zicong Wu, Junhong Chen, Ruiqi Zhu, Adnan Munawar, Bo Xiao, Yuan Guan, Hang Su, Wuzhou Hong, Yao Guo, Gregory S. Fischer, Benny Lo, Guang-Zhong Yang

    Abstract: Human-robot shared control, which integrates the advantages of both humans and robots, is an effective approach to facilitate efficient surgical operation. Learning from demonstration (LfD) techniques can be used to automate some of the surgical subtasks for the construction of the shared control framework. However, a sufficient amount of data is required for the robot to learn the manoeuvres. Usi…

    Submitted 4 June, 2022; v1 submitted 23 April, 2022; originally announced April 2022.

    Comments: Accepted by ICRA 2022

  21. Virtual Reality for Synergistic Surgical Training and Data Generation

    Authors: Adnan Munawar, Zhaoshuo Li, Punit Kunjam, Nimesh Nagururu, Andy S. Ding, Peter Kazanzides, Thomas Looi, Francis X. Creighton, Russell H. Taylor, Mathias Unberath

    Abstract: Surgical simulators not only allow planning and training of complex procedures, but also offer the ability to generate structured data for algorithm development, which may be applied in image-guided computer assisted interventions. While there have been efforts on either developing training platforms for surgeons or data generation engines, these two features, to our knowledge, have not been offer…

    Submitted 15 November, 2021; originally announced November 2021.

    Comments: MICCAI 2021 AE-CAI "Outstanding Paper Award" Code: https://github.com/LCSR-SICKKIDS/volumetric_drilling

  22. arXiv:2110.10973  [pdf, other]

    cs.AI cs.CL cs.LG cs.RO

    LOA: Logical Optimal Actions for Text-based Interaction Games

    Authors: Daiki Kimura, Subhajit Chaudhury, Masaki Ono, Michiaki Tatsubori, Don Joven Agravante, Asim Munawar, Akifumi Wachi, Ryosuke Kohita, Alexander Gray

    Abstract: We present Logical Optimal Actions (LOA), an action decision architecture of reinforcement learning applications with a neuro-symbolic framework which is a combination of neural network and symbolic knowledge acquisition approach for natural language interaction games. The demonstration for LOA experiments consists of a web-based interactive platform for text-based games and visualization for acqu…

    Submitted 21 October, 2021; originally announced October 2021.

    Comments: ACL-IJCNLP 2021 (demo paper)

  23. arXiv:2110.10963  [pdf, other]

    cs.AI cs.CL cs.LG cs.RO

    Neuro-Symbolic Reinforcement Learning with First-Order Logic

    Authors: Daiki Kimura, Masaki Ono, Subhajit Chaudhury, Ryosuke Kohita, Akifumi Wachi, Don Joven Agravante, Michiaki Tatsubori, Asim Munawar, Alexander Gray

    Abstract: Deep reinforcement learning (RL) methods often require many trials before convergence, and no direct interpretability of trained policies is provided. In order to achieve fast convergence and interpretability for the policy in RL, we propose a novel RL method for text-based games with a recent neuro-symbolic framework called Logical Neural Network, which can learn symbolic and interpretable rules…

    Submitted 21 October, 2021; originally announced October 2021.

    Comments: EMNLP 2021 (main conference)

  24. arXiv:2103.02363  [pdf, other]

    cs.AI

    Reinforcement Learning with External Knowledge by using Logical Neural Networks

    Authors: Daiki Kimura, Subhajit Chaudhury, Akifumi Wachi, Ryosuke Kohita, Asim Munawar, Michiaki Tatsubori, Alexander Gray

    Abstract: Conventional deep reinforcement learning methods are sample-inefficient and usually require a large number of training trials before convergence. Since such methods operate on an unconstrained action set, they can lead to useless actions. A recent neuro-symbolic framework called the Logical Neural Networks (LNNs) can simultaneously provide key-properties of both neural networks and symbolic logic.…

    Submitted 3 March, 2021; originally announced March 2021.

    Comments: KBRL Workshop at IJCAI-PRICAI 2020

  25. arXiv:2012.05908  [pdf, ps, other]

    eess.AS cs.LG cs.SD

    Ensemble of Discriminators for Domain Adaptation in Multiple Sound Source 2D Localization

    Authors: Guillaume Le Moing, Don Joven Agravante, Tadanobu Inoue, Jayakorn Vongkulbhisal, Asim Munawar, Ryuki Tachibana, Phongtharin Vinayavekhin

    Abstract: This paper introduces an ensemble of discriminators that improves the accuracy of a domain adaptation technique for the localization of multiple sound sources. Recently, deep neural networks have led to promising results for this task, yet they require a large amount of labeled data for training. Recording and labeling such datasets is very costly, especially because data needs to be diverse enoug…

    Submitted 16 March, 2021; v1 submitted 10 December, 2020; originally announced December 2020.

    Comments: Please refer to final version at arXiv:2012.05533

  26. arXiv:2012.05533  [pdf, ps, other]

    eess.AS cs.LG cs.SD

    Data-Efficient Framework for Real-world Multiple Sound Source 2D Localization

    Authors: Guillaume Le Moing, Phongtharin Vinayavekhin, Don Joven Agravante, Tadanobu Inoue, Jayakorn Vongkulbhisal, Asim Munawar, Ryuki Tachibana

    Abstract: Deep neural networks have recently led to promising results for the task of multiple sound source localization. Yet, they require a lot of training data to cover a variety of acoustic conditions and microphone array layouts. One can leverage acoustic simulators to inexpensively generate labeled training data. However, models trained on synthetic data tend to perform poorly with real-world recordin…

    Submitted 17 March, 2021; v1 submitted 10 December, 2020; originally announced December 2020.

    Comments: Accepted to IEEE ICASSP 2021. This article supersedes arXiv:2012.05908

  27. arXiv:2012.05515  [pdf, ps, other]

    eess.AS cs.LG cs.SD

    Learning Multiple Sound Source 2D Localization

    Authors: Guillaume Le Moing, Phongtharin Vinayavekhin, Tadanobu Inoue, Jayakorn Vongkulbhisal, Asim Munawar, Ryuki Tachibana, Don Joven Agravante

    Abstract: In this paper, we propose novel deep learning based algorithms for multiple sound source localization. Specifically, we aim to find the 2D Cartesian coordinates of multiple sound sources in an enclosed environment by using multiple microphone arrays. To this end, we use an encoding-decoding architecture and propose two improvements on it to accomplish the task. In addition, we also propose two nov…

    Submitted 10 December, 2020; originally announced December 2020.

    Comments: Published in: 2019 IEEE 21st International Workshop on Multimedia Signal Processing (MMSP)

  28. arXiv:2009.11896  [pdf, other]

    cs.LG cs.CL stat.ML

    Bootstrapped Q-learning with Context Relevant Observation Pruning to Generalize in Text-based Games

    Authors: Subhajit Chaudhury, Daiki Kimura, Kartik Talamadupula, Michiaki Tatsubori, Asim Munawar, Ryuki Tachibana

    Abstract: We show that Reinforcement Learning (RL) methods for solving Text-Based Games (TBGs) often fail to generalize on unseen games, especially in small data regimes. To address this issue, we propose Context Relevant Episodic State Truncation (CREST) for irrelevant token removal in observation text for improved generalization. Our method first trains a base model using Q-learning, which typically overf…

    Submitted 24 September, 2020; originally announced September 2020.

    Comments: Accepted to EMNLP 2020

  29. Unsupervised Temporal Feature Aggregation for Event Detection in Unstructured Sports Videos

    Authors: Subhajit Chaudhury, Daiki Kimura, Phongtharin Vinayavekhin, Asim Munawar, Ryuki Tachibana, Koji Ito, Yuki Inaba, Minoru Matsumoto, Shuji Kidokoro, Hiroki Ozaki

    Abstract: Image-based sports analytics enable automatic retrieval of key events in a game to speed up the analytics process for human experts. However, most existing methods focus on structured television broadcast video datasets with a straight and fixed camera having minimum variability in the capturing pose. In this paper, we study the case of event detection in sports videos for unstructured environment…

    Submitted 19 February, 2020; originally announced February 2020.

    Comments: Accepted to IEEE International Symposium on Multimedia, 2019

  30. arXiv:1912.07834  [pdf, other]

    cs.AI cs.LO cs.RO

    Design and Implementation of Linked Planning Domain Definition Language

    Authors: Michiaki Tatsubori, Asim Munawar, Takao Moriyama

    Abstract: Planning is a critical component of any artificial intelligence system that concerns the realization of strategies or action sequences typically for intelligent agents and autonomous robots. Given predefined parameterized actions, a planning service should accept a query with the goal and initial state to give a solution with a sequence of actions applied to environmental objects. This paper addre…

    Submitted 17 December, 2019; originally announced December 2019.

    Comments: 17 pages

  31. arXiv:1903.09798  [pdf, other]

    cs.CV

    Spatially-weighted Anomaly Detection with Regression Model

    Authors: Daiki Kimura, Minori Narita, Asim Munawar, Ryuki Tachibana

    Abstract: Visual anomaly detection is common in several applications including medical screening and production quality check. Although a definition of the anomaly is an unknown trend in data, in many cases some hints or samples of the anomaly class can be given in advance. Conventional methods cannot use the available anomaly data, and also do not have a robustness of noise. In this paper, we propose a nov…

    Submitted 28 March, 2019; v1 submitted 23 March, 2019; originally announced March 2019.

    Comments: 4 pages, published as an oral presentation paper at Meeting on Image Recognition and Understanding (MIRU) 2018

  32. A Convex Optimization-based Dynamic Model Identification Package for the da Vinci Research Kit

    Authors: Yan Wang, Radian Gondokaryono, Adnan Munawar, Gregory S. Fischer

    Abstract: The da Vinci Research Kit (dVRK) is a teleoperated surgical robotic system. For dynamic simulations and model-based control, the dynamic model of the dVRK is required. We present an open-source dynamic model identification package for the dVRK, capable of modeling the parallelograms, springs, counterweight, and tendon couplings, which are inherent to the dVRK. A convex optimization-based method is…

    Submitted 25 July, 2019; v1 submitted 27 February, 2019; originally announced February 2019.

    Comments: 8 pages, in IEEE Robotics and Automation Letters, 2019

  33. arXiv:1810.01108  [pdf, other]

    cs.LG cs.CV stat.ML

    Injective State-Image Mapping facilitates Visual Adversarial Imitation Learning

    Authors: Subhajit Chaudhury, Daiki Kimura, Asim Munawar, Ryuki Tachibana

    Abstract: The growing use of virtual autonomous agents in applications like games and entertainment demands better control policies for natural-looking movements and actions. Unlike the conventional approach of hard-coding motion routines, we propose a deep learning method for obtaining control policies by directly mimicking raw video demonstrations. Previous methods in this domain rely on extracting low-di…

    Submitted 25 October, 2019; v1 submitted 2 October, 2018; originally announced October 2018.

    Comments: Updated the paper to match with version accepted at IEEE MMSP 2019

  34. arXiv:1809.08925  [pdf, other]

    cs.LG cs.AI

    Constrained Exploration and Recovery from Experience Shaping

    Authors: Tu-Hoa Pham, Giovanni De Magistris, Don Joven Agravante, Subhajit Chaudhury, Asim Munawar, Ryuki Tachibana

    Abstract: We consider the problem of reinforcement learning under safety requirements, in which an agent is trained to complete a given task, typically formalized as the maximization of a reward signal over time, while concurrently avoiding undesirable actions or states, associated to lower rewards, or penalties. The construction and balancing of different reward components can be difficult in the presence…

    Submitted 21 September, 2018; originally announced September 2018.

    Comments: Code: https://github.com/IBM/constrained-rl

  35. arXiv:1809.04232  [pdf, other]

    cs.AI

    Safe Exploration in Markov Decision Processes with Time-Variant Safety using Spatio-Temporal Gaussian Process

    Authors: Akifumi Wachi, Hiroshi Kajino, Asim Munawar

    Abstract: In many real-world applications (e.g., planetary exploration, robot navigation), an autonomous agent must be able to explore a space with guaranteed safety. Most safe exploration algorithms in the field of reinforcement learning and robotics have been based on the assumption that the safety features are a priori known and time-invariant. This paper presents a learning algorithm called ST-SafeMDP f…

    Submitted 11 September, 2018; originally announced September 2018.

    Comments: 12 pages, 7 figures

  36. arXiv:1808.02200  [pdf, other]

    cs.RO

    Deep Learning with Predictive Control for Human Motion Tracking

    Authors: Don Joven Agravante, Giovanni De Magistris, Asim Munawar, Phongtharin Vinayavekhin, Ryuki Tachibana

    Abstract: We propose to combine model predictive control with deep learning for the task of accurate human motion tracking with a robot. We design the MPC to allow switching between the learned and a conservative prediction. We also explored online learning with a DyBM model. We applied this method to human handwriting motion tracking with a UR-5 robot. The results show that the framework significantly impr…

    Submitted 6 August, 2018; originally announced August 2018.

    Comments: To appear in 36th Annual Conference of the Robotics Society of Japan (RSJ 2018)

  37. arXiv:1807.06749  [pdf, other]

    cs.RO

    Experimental Force-Torque Dataset for Robot Learning of Multi-Shape Insertion

    Authors: Giovanni De Magistris, Asim Munawar, Tu-Hoa Pham, Tadanobu Inoue, Phongtharin Vinayavekhin, Ryuki Tachibana

    Abstract: The accurate modeling of real-world systems and physical interactions is a common challenge towards the resolution of robotics tasks. Machine learning approaches have demonstrated significant results in the modeling of complex systems (e.g., articulated robot structures, cable stretch, fluid dynamics), or to learn robotics tasks (e.g., grasping, reaching) from raw sensor measurements without expli…

    Submitted 25 July, 2018; v1 submitted 17 July, 2018; originally announced July 2018.

    Comments: video at: https://youtu.be/6rLc9fAtzAQ 36th Annual Conference of the Robotics Society of Japan (RSJ 2018), Kasugai, Japan, 2018

  38. arXiv:1806.08523  [pdf, ps, other]

    cs.CV

    Focusing on What is Relevant: Time-Series Learning and Understanding using Attention

    Authors: Phongtharin Vinayavekhin, Subhajit Chaudhury, Asim Munawar, Don Joven Agravante, Giovanni De Magistris, Daiki Kimura, Ryuki Tachibana

    Abstract: This paper is a contribution towards interpretability of the deep learning models in different applications of time-series. We propose a temporal attention layer that is capable of selecting the relevant information to perform various tasks, including data completion, key-frame detection and classification. The method uses the whole input sequence to calculate an attention value for each time step…

    Submitted 22 June, 2018; originally announced June 2018.

    Comments: To appear in ICPR 2018

  39. arXiv:1806.00802  [pdf, other]

    cs.RO

    MaestROB: A Robotics Framework for Integrated Orchestration of Low-Level Control and High-Level Reasoning

    Authors: Asim Munawar, Giovanni De Magistris, Tu-Hoa Pham, Daiki Kimura, Michiaki Tatsubori, Takao Moriyama, Ryuki Tachibana, Grady Booch

    Abstract: This paper describes a framework called MaestROB. It is designed to make the robots perform complex tasks with high precision by simple high-level instructions given by natural language or demonstration. To realize this, it handles a hierarchical structure by using the knowledge stored in the forms of ontology and rules for bridging among different levels of instructions. Accordingly, the framewor…

    Submitted 3 June, 2018; originally announced June 2018.

    Comments: IEEE International Conference on Robotics and Automation (ICRA) 2018. Video: https://www.youtube.com/watch?v=19JsdZi0TWU

  40. arXiv:1708.08985  [pdf, ps, other]

    cs.CV cs.AI cs.LG

    Limiting the Reconstruction Capability of Generative Neural Network using Negative Learning

    Authors: Asim Munawar, Phongtharin Vinayavekhin, Giovanni De Magistris

    Abstract: Generative models are widely used for unsupervised learning with various applications, including data compression and signal restoration. Training methods for such systems focus on the generality of the network given limited amount of training data. A less researched type of techniques concerns generation of only a single type of input. This is useful for applications such as constraint handling,…

    Submitted 15 August, 2017; originally announced August 2017.

    Comments: Conference: IEEE International Workshop on Machine Learning for Signal Processing (MLSP), Roppongi, Tokyo, Japan, September 25-28, 2017

  41. arXiv:1708.04033  [pdf, other]

    cs.RO cs.AI

    Deep Reinforcement Learning for High Precision Assembly Tasks

    Authors: Tadanobu Inoue, Giovanni De Magistris, Asim Munawar, Tsuyoshi Yokoya, Ryuki Tachibana

    Abstract: High precision assembly of mechanical parts requires accuracy exceeding the robot precision. Conventional part mating methods used in the current manufacturing requires tedious tuning of numerous parameters before deployment. We show how the robot can successfully perform a tight clearance peg-in-hole task through training a recurrent neural network with reinforcement learning. In addition to savi…

    Submitted 21 September, 2017; v1 submitted 14 August, 2017; originally announced August 2017.

    Comments: Conference: Accepted to IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, Canada, September 24-28, 2017. Video: https://youtu.be/b2pC78rBGH4

  42. arXiv:1707.00860  [pdf, other]

    cs.LG cs.AI cs.CV

    Conditional generation of multi-modal data using constrained embedding space mapping

    Authors: Subhajit Chaudhury, Sakyasingha Dasgupta, Asim Munawar, Md. A. Salam Khan, Ryuki Tachibana

    Abstract: We present a conditional generative model that maps low-dimensional embeddings of multiple modalities of data to a common latent space hence extracting semantic relationships between them. The embedding specific to a modality is first extracted and subsequently a constrained optimization procedure is performed to project the two embedding spaces to a common manifold. The individual embeddings are…

    Submitted 25 July, 2017; v1 submitted 4 July, 2017; originally announced July 2017.

    Comments: 7 pages, 4 figures, ICML 2017 Workshop on Implicit Models