Spring框架中我們經(jīng)常使用 @Transactional 注解來(lái)做事務(wù),但是事務(wù)并不能保證有效性;
以下是我遇到的問(wèn)題,不一定完全正確,可以做個(gè)參考:
在一個(gè)類上標(biāo)記了 @Transactional,使得該類下的所有方法都以默認(rèn)的事務(wù)方式運(yùn)行。
在一個(gè)方法中分別調(diào)用這個(gè)方法:分別對(duì)這個(gè)方法進(jìn)行try catch異常,防止因?yàn)楫惓;貪L所有數(shù)據(jù)
public class TestAnother{
在正常情況下,這個(gè)方法是沒有問(wèn)題的,但是在線上的時(shí)候,由于請(qǐng)求量較大,也就是我們常說(shuō)的高并發(fā)環(huán)境:
在B方法中,假如我們有一句SQL:delete from users where status = 'test’;
在users表中給status加了一個(gè)索引。
問(wèn)題來(lái)了:
在一般情況下,由于是串行邏輯,所以不會(huì)有影響。
但是在高并發(fā)情況下,由于我們需要delete語(yǔ)句,需要行級(jí)鎖,因?yàn)閟tatus是一個(gè)非聚集索引,所以需要給范圍性的數(shù)據(jù)上行級(jí)鎖,也就是利用了 next-key lock。(InnoDB實(shí)現(xiàn)的RR通過(guò)next-key lock機(jī)制避免了幻讀現(xiàn)象。這部分我也不是特別確定),而在并發(fā)環(huán)境下,由于上一個(gè)方法的鎖未釋放,下一個(gè)方法又進(jìn)來(lái)了。
比如: 第一個(gè)線程進(jìn)來(lái)的時(shí)候需要?jiǎng)h除0-10的數(shù)據(jù),這時(shí)候加鎖加到了第5個(gè),而第二個(gè)線程這個(gè)時(shí)候也進(jìn)來(lái)了,比如隨機(jī)加了其他的鎖,這時(shí)候也需要拿5的鎖,但是沒有拿到,需要等待線程1釋放鎖,而第一個(gè)線程可能剛好需要第二個(gè)線程的隨機(jī)鎖,導(dǎo)致兩個(gè)線程互相等待拿鎖,從而導(dǎo)致死鎖。
話說(shuō)回來(lái),如果 @Transactional 遇到死鎖會(huì)怎么樣呢?
我在本地模擬了死鎖的條件,本地SQL執(zhí)行了一個(gè)start Transactional,但是一直不提交。
用POSTMAN在線上發(fā)了一個(gè)請(qǐng)求,線上的請(qǐng)求中雖然A方法執(zhí)行完成了,但是卡在了B方法遲遲拿不到鎖,最后導(dǎo)致了獲取鎖超時(shí)。下面是通過(guò)數(shù)據(jù)庫(kù)查看的最近一次死鎖的信息:
===================================== 2019-09-07 06:28:38 7fe01c931700 INNODB MONITOR OUTPUT ===================================== Per second averages calculated from the last 24 seconds srv_master_thread loops: 8912 srv_active, 0 srv_shutdown, 516445 srv_idle srv_master_thread log flush and writes: 524528 OS WAIT ARRAY INFO: reservation count 24855 OS WAIT ARRAY INFO: signal count 25085 Mutex spin waits 14574, rounds 408115, OS waits 13345 RW-shared spins 10346, rounds 338033, OS waits 11257 RW-excl spins 216, rounds 7866, OS waits 240 Spin rounds per wait: 28.00 mutex, 32.67 RW-shared, 36.42 RW-excl Purge done for trx's n:o < 690050 undo n:o < 0 state: running but idle LIST OF TRANSACTIONS FOR EACH SESSION: ---TRANSACTION 0, not started MySQL thread id 18225, OS thread handle 0x7fe01c931700, query id 686481 172.17.0.1 root init show engine innodb status ---TRANSACTION 690050, not started MySQL thread id 18223, OS thread handle 0x7fdf6331b700, query id 686305 172.17.0.1 root ---TRANSACTION 690060, not started MySQL thread id 18203, OS thread handle 0x7fe01cabd700, query id 686456 172.17.0.1 root ---TRANSACTION 690058, ACTIVE 32 sec inserting mysql tables in use 1, locked 1 LOCK WAIT 5 lock struct(s), heap size 1184, 3 row lock(s), undo log entries 2 MySQL thread id 18202, OS thread handle 0x7fe01c1b7700, query id 686341 172.17.0.1 root update INSERT INTO spot_account_flows (flowType, refType, refId, fromUserId, fromAccountId, toUserId, toAccountId, currency, amount, description, createdAt) VALUES ('TRADE_CLEAR', 'CLEARING', 0, 100000, 102950, 100000, 108015, 'BTC', 1, '', 1567837686558) ------- TRX HAS BEEN WAITING 32 SEC FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 643 page no 97 n bits 144 index `PRIMARY` of table `ex`.`spot_account_flows` trx id 690058 lock_mode X insert intention waiting Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0 0: len 8; hex 73757072656d756d; asc supremum;; ---TRANSACTION 690056, ACTIVE 36 sec 67 lock struct(s), heap size 13864, 8195 row lock(s), undo log entries 23 MySQL thread id 18224, OS thread handle 0x7fdf63bdf700, query id 686331 172.17.0.1 root I/O thread 0 state: waiting for completed aio requests (insert buffer thread) I/O thread 1 state: waiting for completed aio requests (log thread) I/O thread 2 state: waiting for completed aio requests (read thread) I/O thread 3 state: waiting for completed aio requests (read thread) I/O thread 4 state: waiting for completed aio requests (read thread) I/O thread 5 state: waiting for completed aio requests (read thread) I/O thread 6 state: waiting for completed aio requests (write thread) I/O thread 7 state: waiting for completed aio requests (write thread) I/O thread 8 state: waiting for completed aio requests (write thread) I/O thread 9 state: waiting for completed aio requests (write thread) Pending normal aio reads: 0 [0, 0, 0, 0] , aio writes: 0 [0, 0, 0, 0] , ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0 Pending flushes (fsync) log: 0; buffer pool: 0 4606 OS file reads, 96239 OS file writes, 65171 OS fsyncs 0.00 reads/s, 0 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s ------------------------------------- INSERT BUFFER AND ADAPTIVE HASH INDEX ------------------------------------- Ibuf: size 1, free list len 0, seg size 2, 22 merges insert 29, delete mark 421, delete 364 insert 0, delete mark 0, delete 0 Hash table size 276671, node heap has 26 buffer(s) 0.00 hash searches/s, 0.00 non-hash searches/s Log sequence number 668395471 Log flushed up to 668395471 Pages flushed up to 668395471 Last checkpoint at 668395471 0 pending log writes, 0 pending chkp writes 33363 log i/o's done, 0.00 log i/o's/second Total memory allocated 137363456; in additional pool allocated 0 Dictionary memory allocated 959373 Pending writes: LRU 0, flush list 0, single page 0 Pages made young 3270, not young 25362 0.00 youngs/s, 0.00 non-youngs/s Pages read 3915, created 13555, written 48527 0.00 reads/s, 0.00 creates/s, 0.00 writes/s Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000 Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s LRU len: 7137, unzip_LRU len: 0 I/O sum[16]:cur[0], unzip sum[0]:cur[0] 0 queries inside InnoDB, 0 queries in queue 0 read views open inside InnoDB Main thread process no. 1, id 140600574822144, state: sleeping Number of rows inserted 385622, updated 20256, deleted 79, read 13788081 0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.50 reads/s ---------------------------- END OF INNODB MONITOR OUTPUT ============================
而此時(shí)查數(shù)據(jù)庫(kù)發(fā)現(xiàn),A方法執(zhí)行的事務(wù)也被回滾了。
原因就是:因?yàn)楫?dāng)前線程被數(shù)據(jù)庫(kù)死鎖卡在了獲取鎖的情況下,當(dāng)前請(qǐng)求不能完全結(jié)束,導(dǎo)致 A 方法的事務(wù)不能提交,最后拋出的異常雖然是B方法的,但是A方法由于整個(gè)方法未能正確結(jié)束,所以事務(wù)未能正確提交,而MYSQL事務(wù)的默認(rèn)超時(shí)時(shí)間是50s。
可以通過(guò)此命令 show variables like 'innodb_lock_wait_timeout';
也就是說(shuō)如果50s未能commit事務(wù),那么當(dāng)前事務(wù)將被自動(dòng)回滾。
這也就導(dǎo)致了為什么A方法并沒有報(bào)異常。
說(shuō)到底導(dǎo)致了A方法沒有異常卻回滾了是因?yàn)榉?wù)超時(shí)了。
解決方案:
1.數(shù)據(jù)庫(kù)事務(wù)默認(rèn)為自動(dòng)提交,我們可以手動(dòng)設(shè)置為手動(dòng)提交。
2.方法拆分,使其不在一個(gè)線程內(nèi)即可,這樣A方法就不會(huì)因?yàn)锽方法超時(shí)而回滾。
3.update或者insert或者delete語(yǔ)句使用主鍵索引,這樣可以避免 next-key lock 使其產(chǎn)生范圍鎖。這樣就不會(huì)產(chǎn)生排他鎖而導(dǎo)致線程之間死鎖。
因?yàn)閷?duì)MYSQL的了解并沒有那么深入,錯(cuò)誤歡迎指出。
|