Ham et al., 2024 - Google Patents
ONNXim: A Fast, Cycle-level Multi-core NPU SimulatorHam et al., 2024
View PDF- Document ID
- 6027427457557784741
- Author
- Ham H
- Yang W
- Shin Y
- Woo O
- Heo G
- Lee S
- Park J
- Kim G
- Publication year
- Publication venue
- arXiv preprint arXiv:2406.08051
External Links
Snippet
As DNNs are widely adopted in various application domains while demanding increasingly higher compute and memory requirements, designing efficient and performant NPUs (Neural Processing Units) is becoming more important. However, existing architectural NPU …
- 238000004088 simulation 0 abstract description 40
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5009—Computer-aided design using simulation
- G06F17/5022—Logic simulation, e.g. for logic circuit operation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30386—Retrieval requests
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5045—Circuit design
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
- G06F15/163—Interprocessor communication
- G06F15/173—Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2217/00—Indexing scheme relating to computer aided design [CAD]
- G06F2217/78—Power analysis and optimization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2217/00—Indexing scheme relating to computer aided design [CAD]
- G06F2217/86—Hardware-Software co-design
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2217/00—Indexing scheme relating to computer aided design [CAD]
- G06F2217/68—Processors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2217/00—Indexing scheme relating to computer aided design [CAD]
- G06F2217/70—Fault tolerant, i.e. transient fault suppression
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F1/00—Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/14—Handling requests for interconnection or transfer
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
Similar Documents
Publication | Publication Date | Title |
---|---|---|
You et al. | Fast deep neural network training on distributed systems and cloud TPUs | |
Bhimani et al. | Fim: performance prediction for parallel computation in iterative data processing applications | |
Cassidy et al. | Beyond Amdahl's law: an objective function that links multiprocessor performance gains to delay and energy | |
Jamshed | Using HPC for Computational Fluid Dynamics: A guide to high performance computing for CFD engineers | |
CN116384312B (en) | Circuit yield analysis method based on parallel heterogeneous computation | |
Ahmed et al. | An integrated interconnection network model for large-scale performance prediction | |
Cho et al. | FARNN: FPGA-GPU hybrid acceleration platform for recurrent neural networks | |
Ramaswamy et al. | Scalable behavioral emulation of extreme-scale systems using structural simulation toolkit | |
Everaars et al. | Using coordination to parallelize sparse-grid methods for 3-d cfd problems | |
Ham et al. | ONNXim: A Fast, Cycle-level Multi-core NPU Simulator | |
Sukkari et al. | Asynchronous task-based polar decomposition on single node manycore architectures | |
Ma et al. | Coordinated DMA: improving the DRAM access efficiency for matrix multiplication | |
US20200371843A1 (en) | Framework for application driven exploration and optimization of hardware engines | |
Zhou et al. | Pim-dl: Boosting dnn inference on digital processing in-memory architectures via data layout optimizations | |
Lenormand et al. | A combined fast/cycle accurate simulation tool for reconfigurable accelerator evaluation: application to distributed data management | |
Qureshi et al. | Genome sequence alignment-design space exploration for optimal performance and energy architectures | |
Arasteh | Transaction-level modeling of deep neural networks for efficient parallelism and memory accuracy | |
Huang et al. | A novel multi-CPU/GPU collaborative computing framework for SGD-based matrix factorization | |
Zhang et al. | Implementation and efficiency analysis of parallel computation using OpenACC: a case study using flow field simulations | |
Popov et al. | Teragraph heterogeneous system for ultra-large graph processing | |
Lantreibecq et al. | Model checking and co-simulation of a dynamic task dispatcher circuit using CADP | |
Spinnato et al. | Performance modeling of distributed hybrid architectures | |
US20240354479A1 (en) | Peformance analysis using architecture model of processor architecture design | |
Pallipuram et al. | A regression‐based performance prediction framework for synchronous iterative algorithms on general purpose graphical processing unit clusters | |
Nguyen et al. | A new approach and tool in verifying asynchronous circuits |