More with Less – Deriving More Translation Rules with Less Training Data for DBTs Using Parameterization [conference paper]

Conference

2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO) - October 17-21, 2020

Authors

Jinhu Jiang, Rongchao Dong, Zhongjun Zhou, Changheng Song, Wenwen Wang, Pen-Chung Yew (professor), Weihua Zhang

Abstract

Dynamic binary translation (DBT) is widely used in system virtualization and many other important applications. To achieve higher translation quality, a learning-based approach has recently been proposed to automatically learn semantically-equivalent translation rules. Because translation rules directly impact the quality and performance of the translated host code, one of the key issues is to collect as many translation rules as possible from a minimal training data set. The collected translation rules should also cover (i.e., apply to) as many guest binary instructions or code sequences as possible at runtime. Guest binary instructions that are not covered by the learned rules have to be emulated, which incurs additional runtime overhead. Prior learning-based DBT systems achieve an average of only about 69% dynamic code coverage for SPEC CINT 2006. In this paper, we propose a novel parameterization approach that takes advantage of the regularity and well-structured format of most modern ISAs. It allows us to extend the learned translation rules to instructions or instruction sequences with similar structures or characteristics that are not covered in the training set. More translation rules can thus be harvested from the same training set. Experimental results on QEMU 4.1 show that, using this parameterization approach, we can expand the 2,724 learned rules to 86,423 applicable rules for SPEC CINT 2006. Code coverage can also be expanded from about 69.7% to about 95.5%, with a 24% performance improvement over the enhanced learning-based approach.
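
Illustrative sketch (Python). As a rough illustration of how parameterization can generalize a single learned rule, the sketch below assumes a learned rule pairs one concrete guest instruction with its host translation and that the learning phase supplies the operand correspondence; the ParamRule class, its operand_map argument, and the register syntax are illustrative assumptions, not the paper's actual rule format. The sketch turns the register operands of one learned guest/host pair into symbolic slots, so the same rule can translate other guest instructions with the same structure.

# A minimal, illustrative sketch of the parameterization idea, not the paper's
# implementation: a rule learned from one concrete guest/host instruction pair
# is generalized by turning its register operands into symbolic slots, so the
# same rule also covers other guest instructions of the same structural form.
import re

class ParamRule:
    def __init__(self, guest, host, operand_map):
        # guest/host: one concrete learned pair, e.g.
        #   guest = "addl %ebx, %eax", host = ["add w0, w0, w1"]
        # operand_map: guest->host operand correspondence, assumed to come from
        # the learning phase, e.g. {"%eax": "w0", "%ebx": "w1"}.
        self.slots = {g: f"<{i}>" for i, g in enumerate(operand_map)}
        self.guest_tmpl = self._substitute(guest, self.slots)
        host_slots = {operand_map[g]: s for g, s in self.slots.items()}
        self.host_tmpl = [self._substitute(h, host_slots) for h in host]

    @staticmethod
    def _substitute(text, table):
        for concrete, symbolic in table.items():
            text = text.replace(concrete, symbolic)
        return text

    def apply(self, guest_insn, reg_alloc):
        # Match a new guest instruction against the parameterized guest
        # template; on a match, emit host code using the DBT's current
        # guest-to-host register allocation.
        pattern = re.escape(self.guest_tmpl)
        for i, slot in enumerate(self.slots.values()):
            pattern = pattern.replace(re.escape(slot), rf"(?P<s{i}>%\w+)")
        m = re.fullmatch(pattern, guest_insn)
        if m is None:
            return None  # not covered by this rule; fall back to emulation
        binding = {slot: reg_alloc[m.group(f"s{i}")]
                   for i, slot in enumerate(self.slots.values())}
        return [self._substitute(h, binding) for h in self.host_tmpl]

# A rule learned from "addl %ebx, %eax" now also covers "addl %edx, %ecx".
rule = ParamRule("addl %ebx, %eax", ["add w0, w0, w1"],
                 {"%eax": "w0", "%ebx": "w1"})
print(rule.apply("addl %edx, %ecx", {"%ecx": "w2", "%edx": "w3"}))
# -> ['add w2, w2, w3']

In this toy example, rebinding the slots to the DBT's current register allocation lets one learned rule translate a structurally identical instruction it was never trained on; guest instructions that match no rule would still fall back to emulation, as the abstract notes.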

Link to full paper

More with Less – Deriving More Translation Rules with Less Training Data for DBTs Using Parameterization

Keywords

architectures
