Ham et al., 2024 - Google Patents

ONNXim: A Fast, Cycle-level Multi-core NPU Simulator

Ham et al., 2024

View PDF
Document ID
6027427457557784741
Author
Ham H
Yang W
Shin Y
Woo O
Heo G
Lee S
Park J
Kim G
Publication year
Publication venue
arXiv preprint arXiv:2406.08051

External Links

Snippet

As DNNs are widely adopted in various application domains while demanding increasingly higher compute and memory requirements, designing efficient and performant NPUs (Neural Processing Units) is becoming more important. However, existing architectural NPU …
Continue reading at arxiv.org (PDF) (other versions)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/50Computer-aided design
    • G06F17/5009Computer-aided design using simulation
    • G06F17/5022Logic simulation, e.g. for logic circuit operation
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30386Retrieval requests
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/50Computer-aided design
    • G06F17/5045Circuit design
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/16Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
    • G06F15/163Interprocessor communication
    • G06F15/173Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/76Architectures of general purpose stored programme computers
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F2217/00Indexing scheme relating to computer aided design [CAD]
    • G06F2217/78Power analysis and optimization
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F2217/00Indexing scheme relating to computer aided design [CAD]
    • G06F2217/86Hardware-Software co-design
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/34Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F2217/00Indexing scheme relating to computer aided design [CAD]
    • G06F2217/68Processors
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F2217/00Indexing scheme relating to computer aided design [CAD]
    • G06F2217/70Fault tolerant, i.e. transient fault suppression
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computer systems based on biological models
    • G06N3/02Computer systems based on biological models using neural network models
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F1/00Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/40Transformations of program code
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14Handling requests for interconnection or transfer
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00Subject matter not provided for in other groups of this subclass

Similar Documents

Publication Publication Date Title
You et al. Fast deep neural network training on distributed systems and cloud TPUs
Bhimani et al. Fim: performance prediction for parallel computation in iterative data processing applications
Cassidy et al. Beyond Amdahl's law: an objective function that links multiprocessor performance gains to delay and energy
Jamshed Using HPC for Computational Fluid Dynamics: A guide to high performance computing for CFD engineers
CN116384312B (en) Circuit yield analysis method based on parallel heterogeneous computation
Ahmed et al. An integrated interconnection network model for large-scale performance prediction
Cho et al. FARNN: FPGA-GPU hybrid acceleration platform for recurrent neural networks
Ramaswamy et al. Scalable behavioral emulation of extreme-scale systems using structural simulation toolkit
Everaars et al. Using coordination to parallelize sparse-grid methods for 3-d cfd problems
Ham et al. ONNXim: A Fast, Cycle-level Multi-core NPU Simulator
Sukkari et al. Asynchronous task-based polar decomposition on single node manycore architectures
Ma et al. Coordinated DMA: improving the DRAM access efficiency for matrix multiplication
US20200371843A1 (en) Framework for application driven exploration and optimization of hardware engines
Zhou et al. Pim-dl: Boosting dnn inference on digital processing in-memory architectures via data layout optimizations
Lenormand et al. A combined fast/cycle accurate simulation tool for reconfigurable accelerator evaluation: application to distributed data management
Qureshi et al. Genome sequence alignment-design space exploration for optimal performance and energy architectures
Arasteh Transaction-level modeling of deep neural networks for efficient parallelism and memory accuracy
Huang et al. A novel multi-CPU/GPU collaborative computing framework for SGD-based matrix factorization
Zhang et al. Implementation and efficiency analysis of parallel computation using OpenACC: a case study using flow field simulations
Popov et al. Teragraph heterogeneous system for ultra-large graph processing
Lantreibecq et al. Model checking and co-simulation of a dynamic task dispatcher circuit using CADP
Spinnato et al. Performance modeling of distributed hybrid architectures
US20240354479A1 (en) Peformance analysis using architecture model of processor architecture design
Pallipuram et al. A regression‐based performance prediction framework for synchronous iterative algorithms on general purpose graphical processing unit clusters
Nguyen et al. A new approach and tool in verifying asynchronous circuits