I will use English here because I am going to copy this to MSDN later on.
So.
4 out of 5 projects I have been involved in during the last couple of years used some kind of retry loop in the event of a database deadlock. Which got me thinking: "Why do we have to repeat this over and over again in the client code?"
It seems we have a classic case of successful problem externalization. It looks like database architects decided not to deal with this [without any doubt difficult] issue, making it the client's problem.
Both of the most popular database engines (Oracle, SQL Server) return a deadlock exception as soon as the deadlock is detected, which in the typical scenario is sooner than the timeout [duh! otherwise it would be a timeout exception, not a deadlock one]. The default time frames for SQL Server are 5 seconds for deadlock detection and 30 seconds for the timeout, which means the server could try to rerun a deadlocked transaction 6 times before giving up. This is the simplest thing to do. Not very smart, but it would already help.
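For illustration, here is roughly the loop those projects keep rewriting, sketched in T-SQL (TRY/CATCH requires SQL Server 2005+; TabX, val and the retry count are made-up placeholders, not anything from a real schema):

DECLARE @retries int, @maxRetries int
SET @retries = 0
SET @maxRetries = 5
WHILE @retries <= @maxRetries
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION
        UPDATE TabX SET val = val + 1 WHERE id = 1  -- the actual work goes here
        COMMIT TRANSACTION
        BREAK  -- success, leave the loop
    END TRY
    BEGIN CATCH
        IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION
        IF ERROR_NUMBER() <> 1205 OR @retries = @maxRetries  -- 1205 = deadlock victim
        BEGIN
            RAISERROR('transaction failed', 16, 1)  -- not a deadlock, or out of retries
            BREAK
        END
        SET @retries = @retries + 1
        WAITFOR DELAY '00:00:00.100'  -- brief pause before trying again
    END CATCH
END

Every project ends up with some variation of this; the point is that the server could do it for us.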
But to an outsider (me) it seems the server could do more. The server detects deadlocks by walking the list of locks held by the suspicious transaction[s]. Instead of rolling back all the way, it could roll back to the point just before the first deadlocked resource was locked, wait a little (meanwhile the other transaction will go through; this can even be tracked), and retry. I am sure this is all much more complicated internally, but I don't see anything impossible here.
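The machinery for partial rollback already exists in the form of savepoints; here is a purely illustrative sketch of the idea (TabX and TabY are hypothetical tables, and of course the server, not the client, would have to place the savepoint and drive the retry):

BEGIN TRANSACTION
UPDATE TabY SET val = val WHERE id = 2      -- earlier work; these locks would be kept
SAVE TRANSACTION before_contention          -- the point just before the contended lock
UPDATE TabX SET val = val + 1 WHERE id = 1  -- suppose this statement deadlocks
-- today a deadlock dooms the whole transaction; the proposal is that the
-- server could instead do the equivalent of:
ROLLBACK TRANSACTION before_contention      -- undo only the contended part
WAITFOR DELAY '00:00:01'                    -- let the other transaction finish
UPDATE TabX SET val = val + 1 WHERE id = 1  -- and rerun just that part
COMMIT TRANSACTION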
Anyway, even a simple retry of the whole transaction would help. I am sure Katmai is already too late in the cycle to add a new feature, but it should at least be discussable for the vNext + 1 version. What do you guys think?
(no subject)
Date: 2007-08-12 08:19 pm (UTC)
The problem of deadlocks is a problem of system design. If all transactions take resources in the same order - ALL of them access T1 first and T2 second - then there will be no deadlocks.
I have had a couple of enterprise systems running this way for a few years: no deadlocks so far.
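A sketch of that convention (T1 and T2 here standing in as hypothetical tables):

-- every transaction in the system follows the same order: T1, then T2
BEGIN TRANSACTION
UPDATE T1 SET val = val + 1 WHERE id = 1  -- always lock T1 first
UPDATE T2 SET val = val + 1 WHERE id = 1  -- then T2; waiters form a line, a lock cycle is impossible
COMMIT TRANSACTION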
(no subject)
Date: 2007-08-12 08:37 pm (UTC)
It is arguable whether it is easier to build deadlock-free or deadlock-tolerant solutions. But that's not the point here. Both are efforts to solve an inherently server-side problem that was externalized because it seemed too difficult at the time.
(no subject)
Date: 2007-08-12 09:08 pm (UTC)
I type Russian slow :)
(no subject)
Date: 2007-08-13 06:22 pm (UTC)
On the other hand, if the original transaction is READ COMMITTED, it is fine to see new committed data after the rollback; it [different data in different parts of the same transaction] could happen even without the rollback, since we didn't request that level of consistency in the first place.
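A quick illustration of that last point (a sketch; TabA is a stand-in table, session 2 is a concurrent connection): even without any rollback, two reads inside one READ COMMITTED transaction can disagree.

-- session 1, READ COMMITTED (the SQL Server default):
BEGIN TRANSACTION
SELECT COUNT(*) FROM TabA  -- returns N
-- ...meanwhile session 2 commits: INSERT INTO TabA (id) VALUES (999)
SELECT COUNT(*) FROM TabA  -- returns N + 1, no rollback involved
COMMIT TRANSACTION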
(no subject)
Date: 2007-08-13 08:02 pm (UTC)
If your business logic allows nonrepeatable and phantom reads, then automatic retry of a transaction in case of deadlock is possible, but I think of that more as the exception.
I think the SQL teams didn't build automatic deadlock handling in order to avoid different behavior when different isolation levels are used.
(no subject)
Date: 2007-08-15 08:01 am (UTC)
Neither transaction has committed anything yet at the moment of the deadlock, so there can be no question of one transaction depending on the other.
"If your business logic allows Nonrepeatable and Phantom reads then automatic retry of transaction in case of deadlock is possible, but I think of it more as exception."
Both nonrepeatable and phantom reads are artifacts that arise at the moment one transaction commits data that another transaction is working with, so this is completely beside the point as well.
How about you write something more concrete - which isolation level you have in mind and, as an example, which transactions are such that a partial rollback would create problems that are otherwise absent. Purely to dispel my doubts that you understand what you are talking about.
(no subject)
Date: 2007-08-15 08:13 am (UTC)
In the case of auto retry it is quite simple: any logic that needs consistency should be inside the transaction, otherwise consistency cannot be guaranteed. So if we rerun the whole transaction, there is nothing to worry about.
Again, it would be very useful to see some concrete examples.
(no subject)
Date: 2007-08-16 02:16 pm (UTC)
The transaction isolation level for both transactions is READ COMMITTED.
Transaction 1 (Tr1):
BEGIN TRANSACTION
DECLARE @var1 int, @var2 int
1. SELECT @var1 = COUNT(*) FROM TabA
…
2. UPDATE TabA SET ... WHERE id = 123
…
3. SELECT @var2 = COUNT(*) FROM TabA
…
4. UPDATE TabB SET ... WHERE id = 456
COMMIT TRANSACTION
Transaction 2 (Tr2):
BEGIN TRANSACTION
1. UPDATE TabB SET ... WHERE id = 456
…
2. INSERT INTO TabA (id) VALUES (124)
COMMIT TRANSACTION
Let's examine the execution.
Tr1 starts.
Tr1 step 1: a shared lock is held for the duration of the statement and released shortly after it finishes; the value of @var1 is N.
Tr1 step 2.
Tr2 starts.
Tr2 step 1.
Tr1 step 3: the value of @var2 is N.
Tr1 step 4: lock wait (resource held by Tr2 step 1).
Tr2 step 2: lock wait (resource held by Tr1 step 2, because the records with id 123 and 124 are on the same page); deadlock.
At this moment the values of @var1 and @var2 are the same.
Let's now assume the server can partially roll back a transaction to just before the contended lock, wait some time for the other transaction to commit, and then roll forward.
Tr1 is rolled back to just before Tr1 step 2.
Tr2 commits.
Tr1 is rolled forward:
Tr1 step 2.
Tr1 step 3: the value of @var2 is N + 1.
Tr1 step 4.
At this moment the values of @var1 and @var2 are different.
Tr1 commits.
My point is: if there is no deadlock (and no partial rollback), the value of @var2 is always the same as the value read just before the UPDATE statement (Tr1 step 2), guaranteed by the exclusive lock on TabA acquired at step 2 and held until the end of the transaction.
(no subject)
Date: 2007-08-16 11:47 pm (UTC)
If you, on the other hand, ensure that @var1 won't change by using "select @var1 ... with (xlock, holdlock)", then the partial rollback will work just fine.
I believe it is not possible to find a case where a partial rollback introduces an inconsistency that was not there before, but I will enjoy playing this game (you provide the example, I counter it) as long as you do.
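Concretely, the hinted read would look something like this (a sketch built on the TabA example above; XLOCK + HOLDLOCK keeps exclusive range locks until the end of the transaction, so Tr2's INSERT is blocked and the count cannot drift):

BEGIN TRANSACTION
DECLARE @var1 int
SELECT @var1 = COUNT(*) FROM TabA WITH (XLOCK, HOLDLOCK)  -- locks held to end of transaction
-- ...the rest of Tr1 as in the example above...
COMMIT TRANSACTION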
(no subject)
Date: 2007-08-13 08:05 am (UTC)
Or is it not allowed to mention this? ;)
(no subject)
Date: 2007-08-13 06:29 pm (UTC)
I am local search, you know.