Summary
AbstractA system for implementing synchronized objects for software transactional memory comprises one or more processors and a memory comprising program instructions executable by the processor to implement a transactional-memory manager configured to coordinate memory access requests directed at the memory from a plurality of transactions. The transactional-memory manager records, within a collaborator record for a shared data object in the memory, identifications of a set of two or more transactions that have requested synchronization on the object. In response to a commit request from a given transaction of the set, the transactional-memory manager determines whether to commit or abort the given transaction based at least in part on the transactional states of other transactions in the set, examining the collaborator record to identify the other transactions.
Description1. Field of the Invention
The present invention is directed to computer systems. More particularly, it is directed to synchronization mechanisms for computer systems implementing transactional memory.
2. Description of the Related Art
In the field of computer systems, considerable effort has been expended on developing techniques to support concurrent access to shared resources. Mutual exclusion locks and monitors represent two traditional concurrent programming synchronization mechanisms. Locks and monitors protect shared resources by separating access to them in time; for example, in one implementation, as long as a given thread of execution retains a lock on an object or resource, no other thread of execution may modify the object, and any other thread attempting to modify the object may be blocked from further execution until the lock is released.
However, locking techniques are known to suffer from several limitations. Coarse-grained locks, which protect relatively large amounts of data, often do not scale. For example, threads of execution on a multiprocessor system may block each other even when they do not actually require concurrent access to the same data. Fine-grained locks may resolve some of these contention issues, but only at the cost of added programming complexity and the increased likelihood of problems such as deadlocks. Locking schemes may also lead to an increased vulnerability to thread failures and delays—e.g., a thread that is preempted or does expensive input/output operations while holding a lock may obstruct other threads for relatively long periods, thereby potentially reducing the overall throughput of the system.
The transactional-memory programming paradigm has been gaining momentum as an approach of choice for replacing locks in concurrent programming. In transactional-memory programming, sequences of concurrent operations may be combined into nonblocking atomic transactions, thus making parts of the code appear to be sequential without the need to use locks. Executing threads indicate transaction boundaries, e.g., by specifying when a transaction starts and when it ends, but do not have to acquire locks on any objects. Transactional-memory programming techniques may allow transactions that do not overlap in data accesses to run uninterrupted in parallel; transactions that do overlap may be aborted and retried.
The atomicity and isolation of transactions may significantly simplify concurrent programming. Atomicity is a property that guarantees that either all steps in a transaction succeed, or the entire transaction is rolled back, thus giving the impression that the transaction never occurred. Isolation refers to the ability of a transactional system to make operations in a transaction appear isolated from all other transactions or external operations. While attractive from the perspective of decreasing programming complexity, strict isolation may have some negative implications for transactional-memory systems: e.g., it may prevent interactions between transactions. Such interactions may be useful for several purposes, such as implementing condition synchronization and increasing concurrency. Synchronization techniques that allow such interactions may make it easier to implement increased parallelism within transactions, thereby potentially enhancing system-wide throughput in transactional-memory systems, and may provide a way for some transactions to share modified values of data prior to committing without compromising isolation for other transactions that do not need to share uncommitted data.
SUMMARYVarious embodiments of methods and systems for implementing synchronized objects for software transactional memory are disclosed. According to one embodiment, a system comprises one or more processors and a memory comprising program instructions executable by the processors to implement a transactional-memory manager configured to coordinate memory access requests directed at the memory from a plurality of transactions. The transactions may each comprise a set of operations, e.g., specifying memory locations at an address level, and transaction boundaries may be defined programmatically within application code. The transactional-memory manager may be configured to record, within a collaborator record for a shared data object in the memory, identifications of a set of two or more transactions that have each requested synchronization on the object (e.g., so that each transaction of the set may see uncommitted updates performed on the shared object by other transactions of the set). In response to a commit request from a given transaction of the set, the transactional-memory manager may be configured to determine whether to commit or abort the given transaction based at least in part on the transactional states of other transactions in the set, examining the collaborator record to identify the other transactions. If the transactional-memory manager determines that the given transaction is to be committed based on the states of the other transactions, it may commit the given transaction; otherwise, the given transaction may be aborted, or may have to wait until either the other transactions are all ready to commit or at least one of the other transactions aborts. By requesting synchronization on the shared data objects, the transactions of the set may indicate that isolation requirements (which may typically prevent uncommitted updates from one transaction from being seen by other transactions) be relaxed for the set of transactions and for the object or objects on which synchronization is requested by transactions in the set, thereby potentially allowing increased concurrency and inter-transaction update sharing, without affecting the operations of other transactions that may not need to relax isolation requirements. Part or all of the functionality of the transactional-memory manager may be implemented by a plurality of components in some embodiments, e.g., by the threads that participate in the transactions.
In one embodiment, the transactional-memory manager may also be configured to record, within a respective shared object record for each transaction of the set of transactions, an identification of one or more additional shared data objects on which the corresponding transaction has also requested synchronization. In response to the commit request from the given transaction, the transactional-memory manager may be configured to examine the shared data object record of the given transaction to identify additional shared data objects on which the given transaction has requested synchronization. The determination of whether to commit or abort the given transaction may then also be based on the transactional state of an additional transaction identified from the collaborator record of the additional shared data object. Thus, the set of transactions whose states may be examined in order to determine whether to commit or abort a given transaction may include the collaborator transactions for each of the shared data objects on which the given transaction has synchronized. A transaction T2 may be termed a cohort of another transaction T1 if T2 is either a collaborator of T1 or a collaborator of a cohort of T1. In some embodiments, the set of cohorts of the given transaction may be identified, e.g., by recursively traversing linked shared data object records and collaborator records, and the transactional states of each of the transactions in the cohort set may be examined to determine whether the given transaction is to be aborted or committed. Transactions in a cohort set may either all be committed or all be aborted; e.g., if the transactional-memory manager determines that one transaction requesting commit is to be committed, all the cohorts of that transaction may also be committed; and if the transactional-memory manager determines that one transaction of a cohort set is to be aborted, all transactions of the cohort set may be aborted.
In determining whether to commit or abort a given transaction requesting a commit, the transactional-memory manager may decide to commit if any one of the cohorts of the requesting transaction is already in a committed state, or if each of the cohorts is in a committing state (indicating that the cohort is ready to commit) and has not been aborted. If any of the cohorts is in an aborted state, the given transaction and its cohorts may all be aborted. If a cohort is in an active state, indicating that the cohort has neither indicated that it is ready to commit nor been aborted, the transactional-memory manager may in some embodiments wait until the active transaction has changed state before making the commit/abort decision. If the active cohort aborts, the given transaction may be aborted; if instead the cohort changes to a committing state, the transactional-memory manager may commit the given transaction eventually, if all the other cohorts also reach the committing state. In some embodiments, before a transaction commits, the transaction logically “seals” any objects for which it has requested synchronization, preventing any additional transactions not already identified in the collaborator records of those objects from making changes to the objects. In addition to being used for commits, such a seal may, for example, serve as a delimiter between successive consistent versions of the data of the shared data object: e.g., when a shared data object has been sealed, no further requests for synchronization on the shared data object may be honored until the cohort set linked to the sealed shared data object either commits or aborts.
BRIEF DESCRIPTION OF THE DRAWINGS
While the invention is susceptible to various modifications and alternative forms, specific embodiments are shown by way of example in the drawings and are herein described in detail. It should be understood, however, that drawings and detailed description thereto are not intended to limit the invention to the particular form disclosed, but on the contrary, the invention is to cover all modifications, equivalents and alternatives falling within the spirit and scope of the present invention as defined by the appended claims.
DETAILED DESCRIPTION OF EMBODIMENTS
In order to resolve some types of conflicts between transactions 155, transactional-memory manager 105 may utilize services provided by a contention manager 185, as described below in further detail. It is noted that the term “transaction”, as used herein, may in some embodiments refer to lower-level operations than those typically managed by concurrency control algorithms within database management systems (although even in such embodiments, low-level operations of threads of a database management system process may be coordinated using a transactional-memory manager 105). In other embodiments, the techniques described herein may also or instead be used to manage concurrent database transactions. In some embodiments, transactional-memory manager 105 may be implemented entirely in application or user space, while in other embodiments, portions or all of transactional-memory manager 105 may be embedded within a virtual machine process (e.g., a Java™ virtual machine (JVM)) and/or within an operating system (which may be configured to implement virtual memory). In one embodiment, transactional-memory manager 105 may be deployed as a shared library (e.g., within one or more Java™ archive or “jar” files at a JVM). Transactional-memory manager 105 may rely on atomic operation support provided by an underlying operating system or virtual machine implementation in some embodiments, e.g., to atomically change transaction state. Part or all of the functionality of the transactional-memory manager 105 may be implemented by a plurality of components in some embodiments, e.g., by the threads that participate in the transactions.
The term “transactional object”, as used herein, may refer to an object that contains or encapsulates an ordinary object that may be declared or defined in a programming language (e.g., any cloneable object defined in the Java™ programming language may be encapsulated in some embodiments). The encapsulation of the ordinary object within a transactional object may allow transactional-memory manager 105 to support concurrent programming without the use of locks, as described below in further detail. A transactional object may be designated as either unsynchronized or synchronized, e.g., using programming language declarations. In one implementation utilizing an object-oriented language such as the Java™ programming language, for example, a transactional object that is unsynchronized may be declared as an instance of a TMObject class, and a synchronized transactional object may be declared as an instance of a TMSynchronizer class that is a subclass of the TMObject class. The term “unsynchronized object” and “unsynchronized transactional object” may be used synonymously herein, and the terms “synchronized object” and “synchronized transactional object” may also be used synonymously herein. Modifications made to the ordinary object encapsulated within an unsynchronized object 109 by a transaction 155 may not be visible to other transactions until the modifying transaction commits. Modifications made to the ordinary object encapsulated within a synchronized object 107 by a transaction 155 that has registered for synchronization on the synchronized object may be visible, prior to the modifying transaction's commit, only to other transactions that have also registered for synchronization on the same synchronized object 107. Thus, some transactions 155 may share updated data with other transactions prior to committing, thereby relaxing the isolation requirement typical of transactional systems; other transactions may restrict their updates to unsynchronized transactional objects 109, and thus maintain strict isolation. It is noted that while the isolation requirement may be relaxed for transactions that access synchronized transactional objects 107, the atomicity requirement may not be relaxed. All transactions 155 are atomic: a transaction is either committed (i.e., its changes are made permanent) or aborted (its changes are discarded).
Transactional-memory manager 105 may be configured to provide a low-level application programming interface (API) to support coordination of accesses to shared data without using locks in some embodiments. For example, a beginTransaction( ) interface may be provided to allow programmatic indication of the initiation of a transaction 155, and a commitTransaction( ) interface may be provided to allow programmatic indication of a request to commit a transaction 155 (i.e., to make permanent any data modifications—performed between the invocations of the beginTransaction( ) and commitTransaction( ) interfaces). Transactional-memory manager 105 may in some embodiments provide different interfaces to allow access to unsynchronized and synchronized objects 109 and 107 respectively: e.g., in one implementation, an open( ) interface may be used to request access to unsynchronized objects 109, and a synchronize( ) interface may be used to request (synchronized) access to synchronized objects 107. In some embodiments, a single interface (such as synchronize( )) to access a synchronized object 107 may serve two logically distinct purposes: first, the requesting transaction 155 may be registered as a synchronizing transaction with respect to the object 107, and second, a pointer to the encapsulated data of the synchronized object 107 may be returned to the requesting transaction 155. In some embodiments, separate interfaces may be provided for registration and for obtaining access to the encapsulated data. In order to ensure that the relaxation of the isolation requirement does not lead to inconsistent data or incorrect operations, all the transactions 155 that synchronize concurrently on a given synchronized object 107 are either committed, or all are aborted. Further details on how the set of transactions that are to be committed or aborted together is enumerated, and how it is determined whether to commit or abort the set, are provided below. In some embodiments, one or more of the transactions that synchronize on a synchronized object 107 may not actually modify the encapsulated data of the synchronized object; in some such cases, for example, a synchronized object 107 may serve simply as a barrier synchronization mechanism without requiring any uncommitted updates to be shared across transactions. In one embodiment from among a set of transactions 155 that have synchronized on a given object 107, one or more transactions may modify the encapsulated object, and one or more other transactions may read the uncommitted modifications to the object (thus implementing, for example, a producer-consumer buffer). It is noted that in embodiments where multiple transactions modify the encapsulated object, the modifying transactions may implement a protocol (e.g., a locking protocol) to ensure that they do not make inconsistent modifications to the encapsulated object. In some such embodiments, transactional-memory manager 105 may implement or participate in such a protocol, and in other embodiments the protocol may be implemented independently of transactional-memory manager 105.
A given transaction 155 may access one or more unsynchronized objects 109, one or more synchronized objects 107, or a combination of synchronized and unsynchronized transactional objects in various embodiments. Transactions that register to synchronize on at least one synchronized object 107 may be termed “synchronizing” transactions herein, and transactions that do not attempt to register for any synchronized objects 107 may be termed “nonsynchronizing” transactions. At any given time, the set of transactions that are active in system 100 may include any combination of nonsynchronizing and synchronizing transactions (e.g., only nonsynchronizing transactions, only synchronizing transactions, or both nonsynchronizing transactions and synchronizing transactions). Transactional-memory manager 105 may be configured to manage interactions and conflicts among the nonsynchronizing and synchronizing transactions.
In one embodiment, transactional-memory manager 105 may be configured to implement a synchronized object 107 using a data structure termed a synchronized object locator (SOL) 110. (
It is noted that locator objects similar in concept to SOLs may also be used to implement unsynchronized objects 109 in some embodiments; however, since only a single transaction 155 may be allowed to modify an unsynchronized object 109 at a time, the locator objects of unsynchronized objects may only need to identify a single “opener” transaction that most recently opened the unsynchronized object for access. In some embodiments, e.g., to simplify implementation, SOLs 110 may also be used for unsynchronized objects 109; in such an embodiment, a C-List 115 for an unsynchronized object 109 may not need to include more than one transaction entry 145.
In addition to the C-List 115, an SOL 110 may also comprise an object state indicator 120 (which may also be termed a “seal” indicator herein), that may be used, for example, to ensure that additional transactions do not succeed in registering for object 107 (and thereby potentially changing the encapsulated data of object 107) after one of the transactions issues a commit request. A “seal” may be placed on a synchronized object 107 (e.g., by setting object state indicator 120 to a particular value) to act as a delimiter between different consistent versions of the encapsulated data; a transaction 155 that has synchronized on an object 107 may be required to seal the object (or to ensure that the object 107 has been sealed by another collaborating transaction) before it commits. After an object 107 has been sealed, the modifications made by the transactions in the C-List 115 of that object may not be visible to any other transaction until the transactions on the C-List 115 all commit. In some implementations, the object state indicator may be implemented as a special entry in a C-List 115. For example, in one implementation, the C-List may be implemented as an ordered list of records, and to seal a synchronized object 107, a seal record may be placed at the head of the C-List when a transaction from among the transactions identified in the C-List 115 issues a commit request and the synchronized object is not already sealed.
In some embodiments, two or more versions of the encapsulated data of the synchronized object 107 may be maintained, such as an “old” version 160 and a “new” version 150, as shown in
The transactions 155 identified in a given C-List 115 may be referred to herein as “collaborators” or “collaborating transactions” with respect to the corresponding synchronized object 107. In some embodiments, a transaction 155 may register to synchronize on more than one synchronized object 107, each of which may have its own C-List 115. As noted above, collaborators of a given synchronized object 107 are not isolated from one another, since they can see each other's uncommitted changes to the encapsulated data. If a transaction 155A aborts, then 155A's collaborators must also be aborted to prevent the observation of partial effects of the aborted transaction by noncollaborating transactions. This implies that collaborators of 155A's collaborators should also be aborted. For any given transaction 155, the term “cohort” may be used herein to refer to another transaction that is either a collaborator of the transaction 155, or a collaborator of a cohort of transaction 155. (In terms of formal mathematics, the cohort relationship may be an equivalence relation, i.e., it may be reflexive, symmetric and transitive.)
Transactional-memory manager 105 may be configured to identify the set of cohorts of any given synchronizing transaction 155 that requests a commit in some embodiments. As described below in further detail, in many cases transactional-memory manager 105 may not need to identify the complete cohort set before making a commit/abort decision (e.g., as soon as a cohort in an aborted state is found, the enumeration of additional cohorts may no longer be needed); however, in principle at least, the entire cohort set of a transaction may have to enumerated.
Transactional state information may be maintained for each transaction, e.g., in state indicator 325 within the corresponding transaction descriptor 310.
It is noted that in some embodiments, a simpler set of state transitions may be implemented for nonsynchronizing transactions 155 (i.e., transactions 155 that do not synchronize on any synchronized objects 107). The COMMITTING state may not be required for such transactions in some embodiments, since nonsynchronizing transactions may not need to wait for any other transactions before being committed/aborted. Thus, nonsynchronizing transactions may change state atomically from the ACTIVE state to the COMMITTED state or from the ACTIVE state to the ABORTED state in such embodiments.
Similarly, synchronized object S2 may be created to encapsulate and provide synchronized access to another object. In some embodiments, the encapsulated object (e.g., “ti” in the above code example) may be required to be “cloneable” (i.e., the underlying programming language may be required to provide support for duplication or cloning of the encapsulated object). In some implementations, the ability to atomically duplicate the object may be needed to manage the new version 150 and old version 160 of the encapsulated object's data during commits and aborts. After synchronized objects S1 and S2 have been created, transaction T1 may be started (block 502), e.g., by a thread invoking a beginTransaction( ) API. T1 may enter the ACTIVE state in block 502. T1 may subsequently register to synchronize on S1 (block 504) and S2 (block 506). Meanwhile, transaction T2 may start (block 522) and register for synchronization on S1 (block 524), thus becoming a collaborator of T1 with respect to S1. Similarly, transaction T3 may start (block 542) and may register as a collaborator of T1 with respect to S2 (block 544). Thus, T1, T2 and T3 may form a cohort set with respect to S1 and S2. Each of the transactions T1, T2 and T3 may begin independently of the others, and may register for synchronization at desired synchronized objects independently of the others.
Each transaction may then perform data accesses on the synchronized objects (and on any other objects accessed within the scope of the respective transactions). When T1's data accesses (block 508) complete, T1 may be ready to commit (block 510), and may, for example, atomically change state from ACTIVE to COMMITTING, e.g., by invoking a commitTransaction( ) API. Meanwhile, T2 may still be performing its data accesses and computations (block 526), and may still have access to as-yet-uncommitted changes to S1 made by T1. Also, T3 may still be performing its data accesses and computations (block 544), and may be able to read uncommitted changes to S2 performed by T1. T1 may have to wait for its cohorts T2 and T3 to complete their operations and enter the COMMITTING state (block 512) before T1 commits. When T2 changes to the COMMITTING state (block 528), T1 may “seal” S1 (block 514) (e.g., by modifying object state indicator 120 for S1), indicating that no other transactions are to be allowed to synchronize on S1 until T1 and T2 commit or abort. (The dotted arrow from block 528 to block 514 indicates the dependency of T1 on T2's reaching the COMMITTING state before T1 may seal S1.) Similarly, when T3 changes to the COMMITTING state (block 546), T1 may “seal” S2 (block 516). Having sealed all the synchronized objects 107 for which it is registered (i.e., S1 and S2), T1 may change from the COMMITTING state to the COMMITTED state, indicating a successful commit operation (518). In some embodiments, e.g., a value indicating success may be returned to T1 from the commitTransaction( ) call to indicate the change to the COMMITTED state. T2 and T3 may also be committed (i.e., they may change state to the COMMITTED state) (blocks 530 and 550 respectively). It is noted that while in the depicted time sequence shown in
It is noted that in some embodiments, the “sealing” of a synchronized object 107 by T1 may be performed at a different time relative to the waiting operations of block 512. For example, in the depicted embodiment, if a transaction T4 attempts to register for S1 (or S2) after T1 enters the COMMITTING state in block 510, and before T2 or T3 enter the COMMITTING state, T4 may be allowed to successfully register, and may be included in the C-List 115 for S1 (or S2). T1 may then have to wait for T4 to also enter the COMMITTING state before S1 (or S2) is sealed, thus potentially lengthening the time window in which other transactions T5, T6, etc. may also join the cohort set. In another embodiment, T1 may instead seal S1 and S2 as soon as T1 enters the COMMITTING state, while T2 and T3 are still in the ACTIVE state (e.g., operations of block 514 and 516 may be performed prior to the wait of block 512). By sealing objects S1 and S2 early, T1 may reduce the time that it may have to wait for cohorts; however, new transactions that request registration for S1 or S2 may have to wait longer to register (since they may have to wait until the objects are unsealed). In some embodiments, the sealing operation may be independent of, and performed prior to, any transaction's being ready to commit. For example, an API specifically for sealing (e.g., setSeal( )) may be supported in some embodiments, allowing a transaction 155 to “freeze” the cohort set at any time after the transaction 155 registers, e.g., by preventing any additional transactions from registering until the initiator of the freeze commits or aborts.
In one example use case, transactions such as T1, T2 and T3 of
If, on the other hand, the object 107 was already in a sealed state when the registration request is received (as detected in block 610), the transactional-memory manager 105 may be configured to unseal the object 107 (block 620), e.g., either by waiting until the cohorts associated with the object 107 commit/abort, or by actively aborting the cohorts. Unsealing operations corresponding to block 620 of
The object 107 for which the registration request was received may eventually be unsealed, either as a result of a commit by one or more cohorts, or as a result of an abort by one or more cohorts. In some embodiments, the object 107 may remain sealed until all the cohorts' states change to the COMMITTED state or the ABORTED state, while in other embodiments, the object may be unsealed as soon as any one of the cohorts' state changes to the COMMITTED state or the ABORTED state. In one specific embodiment, the object 107 may be unsealed as soon as it becomes evident that the cohorts are going to be committed/aborted—e.g., a cohort may be deemed to be “logically committed” or “effectively committed” if all its cohorts are in the COMMITTING state and all the synchronized objects 107 of the cohort set are sealed, and the object may be unsealed as soon as a cohort is found to be effectively committed. When the commit/abort occurs, the object state indicator 120 of the SOL 110 for the object 107 may be updated to reflect the unsealed state.
A new SOL 110 may be created for the requesting transaction after the object 107 is unsealed (block 625), e.g., with a C-List 115 that identifies only the requesting transaction. The new and old data pointers 125 and 130 (shown in
Returning now to
Claims
Illustrations


















