Current Projects

Electro-Photonic Computing (EPiC) for On-Premise Applications (funded by IARPA)

The main objective of the project is to develop a complete end-to-end high-performance DNN system for on-premise computing applications, mainly SWaP-constrained autonomous vehicles (AVs), using hybrid electro-photonic accelerators. We propose to design and prototype a complete electro-photonic computing (EPiC) system (CPUs + accelerators), integrate it with the sensors in an AV, and demonstrate its capability to perform perception, mapping, and planning while overcoming the power and performance limitations of CMOS-only computers. As our end goal, we plan to demonstrate a fully autonomous buggy that uses our EPiC system.

Recent Related Papers – [ARXIV 2023], [JETC 2023], [HOTCHIPS 2022], [ARXIV 2021]

Privacy-Preserving Computing using Fully Homomorphic Encryption (funded by RedHat, NSF)

The high-level goal of this project is to design a complete end-to-end solution for privacy-preserving computing using Fully Homomorphic Encryption (FHE). We are exploring the design of hardware accelerators for individual primitives such as modular addition and multiplication, for complex operations such as bootstrapping, and for complete FHE-based applications.
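As a rough illustration of the primitives named above, the following minimal Python sketch shows modular addition with a single conditional subtraction and modular multiplication using Barrett reduction. Barrett reduction is an assumption chosen for illustration; the project description does not say which reduction method the accelerators use. The class name BarrettModulus and the example modulus are hypothetical.

```python
class BarrettModulus:
    """Modular add/mul for a fixed modulus q (illustrative sketch only).

    Barrett reduction is assumed here; the source does not name a method.
    Operands are expected to already lie in the range [0, q).
    """

    def __init__(self, q: int) -> None:
        if q < 2:
            raise ValueError("modulus must be at least 2")
        self.q = q
        self.k = q.bit_length()                # q < 2**k
        self.mu = (1 << (2 * self.k)) // q     # precomputed floor(4**k / q)

    def add(self, a: int, b: int) -> int:
        # a + b < 2q, so one conditional subtraction is enough.
        s = a + b
        return s - self.q if s >= self.q else s

    def mul(self, a: int, b: int) -> int:
        x = a * b                              # x < q**2 < 4**k
        t = (x * self.mu) >> (2 * self.k)      # estimate of floor(x / q)
        r = x - t * self.q                     # estimate is low by at most 1
        return r - self.q if r >= self.q else r


if __name__ == "__main__":
    m = BarrettModulus(65537)                  # example modulus (assumption)
    print(m.add(65000, 1000), m.mul(12345, 54321))
```

The appeal of this style of reduction for hardware is that the per-operation work is multiplications, shifts, and one comparison, with the division done once when the modulus is fixed.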

Recent Related Papers – [MICRO 2023], [MICRO 2023], [TVLSI 2023], [HPCA 2023], [SEED 2022], [ARXIV 2021]

Network and Memory Architectures for Manycore/GPU Systems (funded by NSF, DARPA)

The high-level goal of this project is to develop novel electrical/silicon-photonic network architectures and memory architectures for manycore processors and GPUs. The project has four sub-projects: 1) developing run-time, system-level power management techniques for the laser power and thermal-tuning power of silicon-photonic networks; 2) designing new network and memory systems for next-generation multi-GPU systems; 3) designing novel cross-layer design automation methods for 2.5D-integrated heterogeneous systems with electrical and silicon-photonic networks; and 4) integrating Optically-controlled Phase Change Memory technology with silicon-photonic link technology to achieve a “one-stop-shop” solution that provides seamless high-bandwidth communication between manycore/GPU processors and high-density memory.

Recent Related Papers – [TACO 2022], [PACT 2022], [DATE 2021], [TCAD 2020], [HPCA 2020], [PACT 2020], [DATE 2020], [ISCA 2019]

Taming Memory Corruption with Security Monitors (funded by NSF, Google)

The overarching goal of this project is to secure processors against memory-corruption-based exploits. Using a RISC-V processor as our example target system, we propose a decoupled, modular security monitor that uses an array of hardware units called policy engines to monitor application execution in the RISC-V processor. The processor can have one or more monitors, where each monitor hosts both programmable policy engines (PPEs), which offer flexibility at the cost of efficiency, and specialized policy engines (SPEs), which implement a fixed policy with higher energy efficiency. These policy engines implement a variety of security enforcement mechanisms, including memory safety and fine-grained protection domains.

Recent Related Papers – [ACSAC 2021], [DATE 2021], [USENIX SECURITY 2020], [DIMVA 2020], [ASIACCS 2018]

Past Projects

BlackParrot – An Open-Source RISC-V Multicore Processor (funded by DARPA)

The goal of this project is to design and open-source a Linux-capable, cache-coherent, RV64GC multicore processor. This processor is currently being developed by the University of Washington and Boston University, but it strives to be a community-driven, infrastructure-agnostic core that is Pareto optimal in terms of power, performance, area, and complexity. BlackParrot is ideal as the basis for a lightweight accelerator host, as a standalone Linux core, or as a hardware research platform.

Recent Related Papers – [HOST 2023], [DATE 2023], [DAC 2021], [IEEE MICRO 2020]

Securing CMOS Integrated Circuits Using Nanoantenna-based Optical Watermarks (funded by Honeywell, NSF)

The objective of this project is to develop and demonstrate optical nanoantennas as an optical watermarking technology that can be used to rapidly detect the insertion of malicious hardware Trojans in CMOS IC chips. We propose to strategically embed predefined structures in one or more layers of the metal stack within each standard cell while developing the standard cell library. These metal nanostructures can be engineered to produce unique optical signatures that are a function of their design and surrounding environment. Any modification that replaces or rearranges existing cells to add a Trojan can be detected through rapid post-fabrication backside imaging.

Recent Related Papers – [TCAD 2021], [ACCESS 2020], [DAC 2015]

Hardware Accelerators for Programmable Smart Machines (funded by NSF)

The goal of this project is to develop novel hardware accelerator architectures that can be used to accelerate both general-purpose and special-purpose applications. This effort is part of a larger project whose overarching goal is to develop Programmable Smart Machines (PSMs). PSMs are hybrid computing systems that behave as programmed but transparently learn and automatically improve their operation.

Plastic Neuromorphic Hardware for Autonomous Navigation in Mobile Robots (funded by NSF and NASA)

The goal of this project is to develop and translate adaptive neural models into custom neuromorphic hardware for autonomous learning in sensory, motivational, planning, and reinforcement circuits in mobile robots. We adopt an integrated approach based on the joint optimization of neural algorithms and hardware architectures to arrive at low-power, high-performance solutions. In terms of neural modeling, we will develop efficient ego-motion estimation and reinforcement learning modules that are cognizant of the limitations of hardware implementation. Issues such as locality of computation, dynamic coding, and processing load will be addressed, resulting in models that better exploit the hardware characteristics to allow real-time processing and learning in freely behaving robotic agents.

Wave-pipelined Multiplexed Routing for Gigascale Integration (funded by NSF)

The main objective of this project was to develop a pervasive wire-sharing technique, wave-pipelined multiplexed (WPM) routing, that can be easily applied across the entire range of on-chip interconnects. Circuit-level, system-level, and physical-level analyses were completed to explore the limits of and opportunities for applying WPM routing to gigascale integration (GSI) systems. The circuit-level analysis comprised the design, verification, and optimization of the WPM circuit and the measurement of its tolerance to external noise. The physical-level study involved designing wire-sharing-aware placement algorithms to maximize the advantages of WPM routing. For the system-level analysis, a simulator that designs the entire multilevel interconnect network was developed. The effect of WPM routing on a full-custom interconnect network and on a semi-custom interconnect network was studied.

Carbon Nanotube Interconnects in VLSI Applications (funded by SRC IFC)

In this project, we explored the potential of using carbon nanotubes for on-chip communication. Based on the critical parameters of carbon nanotubes, a methodology for interconnect sizing in terms of power, performance, and area was developed. These circuit-level optimizations were then extrapolated to the system level. At the system level, the application of carbon nanotube interconnects for core-to-core communication in multi-core systems was studied. Various data routing strategies were investigated for a range of loads to identify the best possible configuration for carbon nanotubes.

Low-Complexity Decoding Algorithm for Reed-Solomon Codes (funded by NSF)

In this project, we developed a new low-complexity Chase decoding algorithm for decoding Reed-Solomon codes of various lengths. A joint optimization of the decoding algorithm and its hardware implementation was performed to develop an integrated solution.
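For readers unfamiliar with Chase-type decoding, the sketch below illustrates the generic textbook idea only, not the algorithm developed in this project: pick the least reliable symbol positions and enumerate test words that use either the most likely or the second most likely symbol at each of them. Each test word would then be handed to a conventional hard-decision Reed-Solomon decoder, which is not shown. The function name, its parameters, and the reliability convention (higher means more trustworthy) are assumptions made for illustration.

```python
from itertools import product
from typing import List, Sequence


def chase_test_patterns(
    best: Sequence[int],
    second: Sequence[int],
    reliability: Sequence[float],
    eta: int,
) -> List[List[int]]:
    """Enumerate 2**eta Chase-style test words (hypothetical helper).

    best[i] / second[i]: most and second most likely symbol at position i.
    reliability[i]: confidence in best[i]; higher means more trustworthy.
    eta: number of least reliable positions to vary.
    """
    n = len(best)
    # Indices of the eta least reliable positions.
    weakest = sorted(range(n), key=lambda i: reliability[i])[:eta]

    patterns = []
    for choice in product((False, True), repeat=len(weakest)):
        word = list(best)
        for pos, use_second in zip(weakest, choice):
            if use_second:
                word[pos] = second[pos]
        patterns.append(word)
    return patterns


if __name__ == "__main__":
    for word in chase_test_patterns([1, 2, 3, 4], [5, 6, 7, 8],
                                    [0.9, 0.1, 0.8, 0.2], eta=2):
        print(word)
```

The number of test words grows as 2**eta, which is why the complexity of this family of decoders depends so strongly on how many positions are varied and on how much work can be shared between test words.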

Next-Generation Solid Immersion Microscopy for Fault Isolation in Back-Side Analysis (funded by IARPA)

The rapid decrease in the dimensions of integrated circuits has necessitated correspondingly higher-resolution methods for fault isolation and localization. Current state-of-the-art defect imaging systems are reaching the limits of their resolution. Our goal is to investigate the effects of decreasing IC dimensions on fault localization measurements by modeling the interaction of highly focused optical beams with nanoscale semiconductor integrated circuits. To this end, we are building an electromagnetic model that takes into account various parameters, including polarization of light, numerical aperture, doping concentration, voltage level, and circuit dimensions, to obtain a simulated image of the circuit, which would then be verified against the experimental data.

Designing Digital CMOS Logic Circuits using Equalization (funded by BU)

The objective of this project is to combine low-power circuit techniques with ideas from information/coding theory to design reliable and energy-efficient digital CMOS logic circuits operating in the sub-threshold and near-threshold regimes. In particular, we are exploring the use of feedback equalization techniques to dynamically change the switching threshold of the logic gates (based on the switching profiles in the previous clock cycle) in a digital sequential logic block. This dynamic change of switching threshold can be leveraged to mitigate process variation effects and/or reduce critical path delay, improving the reliability and energy efficiency of the digital sequential logic block.