
Next Generation Automatic Memory Management


Modern object-oriented programming languages such as Java, JavaScript, Ruby, and C# are becoming ubiquitous. A primary reason for this trend is that these languages provide automatic memory management (garbage collection), which relieves programmers of the burden of explicitly freeing memory that is no longer needed. Professor Kathryn McKinley at the University of Texas at Austin, in collaboration with Steve Blackburn at the Australian National University, has led an NSF-funded research project exploring how to build the software infrastructure that executes managed programs, i.e., programs in languages that provide automatic memory management. Garbage collection provides a number of software engineering benefits, such as preventing common memory errors that are among the most difficult to diagnose and fix. In the past, however, programs in garbage-collected languages tended to be slower. A garbage collector faces a classic time-space tradeoff among three desirable properties: space efficiency, fast reclamation of objects that are no longer in use, and fast run-time performance, achieved by packing contemporaneously allocated objects together in memory. The three canonical tracing garbage collectors (semi-space, mark-sweep, and mark-compact) each sacrifice one of these objectives.
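To make the tradeoff concrete, below is a minimal, self-contained sketch in Java (illustrative only, not code from the project; all names are invented) of a semi-space copying collector over a toy integer-array heap. It shows the two properties copying buys, fast reclamation proportional to live data and contiguous packing of survivors, and the property it sacrifices: half the heap is permanently reserved as copy space.

import java.util.Arrays;

/** A toy semi-space copying collector (hypothetical sketch, not project code).
    The heap is an int array; an object at address a is laid out as
    [numFields, field0, field1, ...], where each field holds the address
    of another object or -1 for null. */
class ToySemiSpace {
    private final int half;        // slots per semi-space
    private final int[] heap;      // from-space and to-space, back to back
    private int fromBase, toBase;  // bases of the two semi-spaces
    private int free;              // bump pointer in from-space
    private int copyFree;          // bump pointer in to-space during GC

    ToySemiSpace(int halfSize) {
        half = halfSize;
        heap = new int[2 * halfSize];  // half the heap is reserved for copying
        fromBase = 0;
        toBase = halfSize;
        free = fromBase;
    }

    /** Bump-pointer allocation: contemporaneously allocated objects end up
        adjacent in memory, which is good for cache locality. */
    int alloc(int numFields) {
        if (free + 1 + numFields > fromBase + half) return -1;  // full: caller must collect()
        int addr = free;
        heap[addr] = numFields;
        Arrays.fill(heap, addr + 1, addr + 1 + numFields, -1);
        free = addr + 1 + numFields;
        return addr;
    }

    void setField(int obj, int i, int value) { heap[obj + 1 + i] = value; }

    /** Copy everything reachable from the roots into to-space, then flip.
        Cost is proportional to live data only; dead objects are never touched. */
    void collect(int[] roots) {
        int[] forward = new int[heap.length];  // from-address -> to-address
        Arrays.fill(forward, -1);
        copyFree = toBase;
        for (int i = 0; i < roots.length; i++)
            roots[i] = copy(roots[i], forward);
        int scan = toBase;                     // Cheney scan of copied objects
        while (scan < copyFree) {
            int n = heap[scan];
            for (int f = 0; f < n; f++)
                heap[scan + 1 + f] = copy(heap[scan + 1 + f], forward);
            scan += 1 + n;
        }
        int tmp = fromBase; fromBase = toBase; toBase = tmp;  // flip spaces
        free = copyFree;
    }

    private int copy(int addr, int[] forward) {
        if (addr < 0) return -1;                       // null reference
        if (forward[addr] >= 0) return forward[addr];  // already copied
        int n = heap[addr];
        System.arraycopy(heap, addr, heap, copyFree, 1 + n);
        forward[addr] = copyFree;
        copyFree += 1 + n;
        return forward[addr];
    }

    public static void main(String[] args) {
        ToySemiSpace gc = new ToySemiSpace(64);
        int a = gc.alloc(1);
        int b = gc.alloc(0);
        gc.setField(a, 0, b);        // a -> b, both reachable
        gc.alloc(2);                 // garbage: nothing points to it
        int[] roots = { a };
        gc.collect(roots);           // copies a and b; the garbage vanishes
        System.out.println("root a moved to address " + roots[0]);
    }
}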
To attack this tradeoff, the PIs introduced mark-region collection, a new class of collectors whose key insight is to allocate and reclaim memory hierarchically: at a coarse block grain when possible, and otherwise by dividing blocks into finer-grained lines, much as hardware memory systems are organized into pages and cache lines. The PIs also introduced opportunistic defragmentation, which mixes copying and marking in a single pass. Combining both, they implemented Immix, a novel high-performance garbage collector that achieves all three performance objectives. Immix outperforms the existing canonical algorithms, improving total application performance by 7 to 25% on average across 20 benchmarks. Used as the mature space in a generational collector, Immix matches or beats a highly tuned generational collector; for example, it improves SPECjbb2000 by 5%. These innovations and the identification of a new family of collectors open new opportunities for garbage collector design.
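The hierarchical block/line organization is the heart of the design. Below is a hypothetical Java sketch of that bookkeeping (the real collector lives in MMTk/Jikes RVM and uses 32 KB blocks of 128-byte lines; the names ToyImmixHeap, markObject, and nextHole here are invented for illustration): the trace marks live lines rather than sweeping individual objects, completely unmarked blocks are reclaimed wholesale at the coarse grain, and the allocator bump-allocates into runs of unmarked lines ("holes") in partly occupied blocks.

/** A toy sketch of Immix-style hierarchical bookkeeping (hypothetical;
    not the MMTk/Jikes RVM implementation). Memory is managed in coarse
    blocks subdivided into lines; liveness is recorded per line. */
class ToyImmixHeap {
    static final int LINES_PER_BLOCK = 8;  // real Immix: 32 KB blocks of 128-byte lines
    private final boolean[][] lineMarked;  // lineMarked[block][line], set by the trace

    ToyImmixHeap(int numBlocks) {
        lineMarked = new boolean[numBlocks][LINES_PER_BLOCK];
    }

    /** The marking trace calls this for each live object: it marks the
        whole run of lines the object occupies (coarse per-line liveness). */
    void markObject(int block, int firstLine, int lastLine) {
        for (int l = firstLine; l <= lastLine; l++)
            lineMarked[block][l] = true;
    }

    /** True if no line in the block is marked: the whole block is
        reclaimed at the coarse grain, with no per-object sweeping. */
    boolean blockIsFree(int block) {
        for (boolean m : lineMarked[block]) if (m) return false;
        return true;
    }

    /** Finds the next "hole" (a run of unmarked lines) at or after the
        given position, for bump allocation into recycled blocks.
        Returns {block, firstLine, lineAfterLast}, or null if the heap is full. */
    int[] nextHole(int fromBlock, int fromLine) {
        for (int b = fromBlock; b < lineMarked.length; b++) {
            for (int l = (b == fromBlock ? fromLine : 0); l < LINES_PER_BLOCK; l++) {
                if (!lineMarked[b][l]) {
                    int start = l;
                    while (l < LINES_PER_BLOCK && !lineMarked[b][l]) l++;
                    return new int[] { b, start, l };
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ToyImmixHeap heap = new ToyImmixHeap(2);
        heap.markObject(0, 2, 3);         // one survivor straddles lines 2-3 of block 0
        int[] hole = heap.nextHole(0, 0); // allocation reuses the gap before it
        System.out.println("next hole: block " + hole[0]
            + ", lines [" + hole[1] + ", " + hole[2] + ")");
        System.out.println("block 1 wholly free: " + heap.blockIsFree(1));
    }
}

Marking at line granularity is the design point that reconciles the three objectives: reclamation is a cheap per-line or per-block operation, bump allocation into holes preserves locality, and only the fragmentation the line marks reveal triggers opportunistic copying, which is omitted from this sketch.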
Anurag
