[mysql-dde] Re: Fw: P/F/U Keys in LDD

  • From: "Fabricio Mota" <fabricio.oliveira@xxxxxxxxxxxxxxxx>
  • To: <mysql-dde@xxxxxxxxxxxxx>
  • Date: Thu, 22 Dec 2005 10:07:31 -0200

Hey Peter...

I do not agree with all those "perfect" ratings you gave. "Perfect" refers
to very special, ideal conditions...
An ideal solution would be "perfect" in every aspect. For example:

Internal communication: the servers never communicate with each other.
Auxiliary data: the servers never use a single bit to store management
data.
Response time: absolutely zero.
Consistency and integrity: the data never gets into an
inconsistent/incoherent state, not even if disk checksums are corrupted
(0.00000000% of faults). This sounds like
an ideal world, doesn't it??? :)
And so on...

So I suggest we use "Best" when a solution looks unbeatable.
Don't you agree? :)

Hey, I also suggest we use a summary matrix where we can compare all methods
on a single page, such as the attached file. The values filled in are
approximately (MySuggestion + YourSuggestion)/2.
Feel free to change it.

I took the liberty of renaming some of the values. Have a look.

FM

----- Original Message ----- 
From: "Peter B. Volk" <PeterB.Volk@xxxxxxx>
To: "Fabricio Mota" <fabricio.mota@xxxxxxxxx>; "Fabricio Mota" 
<fabricio.oliveira@xxxxxxxxxxxxxxxx>
Cc: <mysql-dde@xxxxxxxxxxxxx>
Sent: Tuesday, December 20, 2005 9:30 PM
Subject: [mysql-dde] Re: Fw: P/F/U Keys in LDD


> Hey, I've added the decision matrix to all options. Feel free to add
> your x-es (please use a different color)
>
> Peter
>  ----- Original Message -----
>  From: Fabricio Mota
>  To: Peter B. Volk
>  Sent: Tuesday, December 20, 2005 3:39 AM
>  Subject: Re: [mysql-dde] Re: Fw: P/F/U Keys in LDD
>
>
>  Hey,
>
>  You forgot one (although I don't agree with it much :) )
>
>  FM
>
>
>  2005/12/19, Peter B. Volk <PeterB.Volk@xxxxxxx>:
>    Hey,
>
>    Here is an overview of the methods that came up during the discussion.
>    Feel free to comment. My favourite is still Suggestion 3.
>
>    Peter
>
>
>
>    ----- Original Message -----
>    From: "Fabricio Mota" <fabricio.oliveira@xxxxxxxxxxxxxxxx>
>    To: < mysql-dde@xxxxxxxxxxxxx>
>    Sent: Thursday, December 15, 2005 2:09 PM
>    Subject: [mysql-dde] Re: Fw: P/F/U Keys in LDD
>
>
>    > Correction: insertion time rate very HIGH (that is what I meant)
>    >
>    > ----- Original Message -----
>    > From: "Fabricio Mota" <fabricio.oliveira@xxxxxxxxxxxxxxxx>
>    > To: <mysql-dde@xxxxxxxxxxxxx >
>    > Sent: Thursday, December 15, 2005 11:01 AM
>    > Subject: [mysql-dde] Re: Fw: P/F/U Keys in LDD
>    >
>    >
>    > > I have been thinking a lot about this situation, and I have concluded
>    > > the following.
>    > >
>    > > The problem:
>    > >
>    > > My desire is to ensure a situation where an LDD table with a unique key
>    > > such as CPF could allow insertion of a value k without having to verify
>    > > its uniqueness among remote servers.
>    > >
>    > > The idea would be to give the remote servers complete autonomy for
>    > > insertion; otherwise, they could have a very low insertion time rate
>    > > compared to independent local databases. This could give our product a
>    > > bad image, since most tables that are candidates to be LDD will have
>    > > unique or primary keys.
>    > >
>    > > My conclusion:
>    > >
>    > > This solution is not possible.
>    > >
>    > > The only way to ensure total independence for unique/primary keys is to
>    > > use independent visible LDD tables (that is, combining all keys with the
>    > > LOCAL column). Consider that few people will be willing to convert
>    > > their keys by combining them with this column. Do you agree?
>    > >
>    > > Although the GI might ensure a 99.9% probability of the servers being
>    > > synchronized, the remaining 0.1% is enough to force us either to check
>    > > remote servers for active transactions with the same values, or to
>    > > break the consistency of the model.
>    > >
>    > > The possible:
>    > >
>    > > As it is impossible to reduce to zero the probability that an
>    > > independent insertion violates consistency, I think what we can do is
>    > > try to minimize this probability.
>    > >
>    > > Suggestion 1
>    > >
>    > > One way I thought of to reduce the probability of checking remote
>    > > servers to near 50% is to establish a priority chain. That is, if we
>    > > have 30 servers in a cluster, each one will have a different priority
>    > > defined, for example, by an integer value 1..30 representing the
>    > > priority degree.
>    > >
>    > > Consider that the only situation in which the GI could be
>    > > unsynchronized for a given value k is when there is an open transaction
>    > > on that value k. So, if two different servers never have the same
>    > > priority at the same moment in time (priorities could change for
>    > > servers over time), a server will never have to check the value k on
>    > > any server with lower priority.
>    > >
>    > > That is, in the example of 30 servers, server number 30 will never have
>    > > to check remote servers for any insertion, only its own GI. Server 29
>    > > will only have to check server 30. Server 28 checks only 29 and 30.
>    > > Server 27 checks 28, 29 and 30, and so on.
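
The priority chain described above can be illustrated with a minimal sketch. It assumes priorities are the integers 1..n and that a larger number means a higher priority; the function name `servers_to_check` is hypothetical and not part of any MySQL-DDE code.

```python
def servers_to_check(priority: int, n_servers: int) -> list[int]:
    """Return the priorities of the servers that must be consulted before
    the server holding `priority` may insert a unique-key value.

    Assumption: priorities are 1..n_servers and only strictly higher
    priorities need a remote check, as in the 30-server example.
    """
    if not 1 <= priority <= n_servers:
        raise ValueError("priority must be in 1..n_servers")
    return list(range(priority + 1, n_servers + 1))


if __name__ == "__main__":
    print(servers_to_check(30, 30))  # server 30 checks nobody
    print(servers_to_check(27, 30))  # server 27 checks 28, 29 and 30
```

The sketch only covers who has to be asked; how the remote check itself is performed is left open in the discussion.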
>    > >
>    > > This probability falls to 50% for the following reason. Consider a
>    > > cluster of n servers, numbered 1..n:
>    > >
>    > > - Server n has the maximum priority;
>    > > - Server n - 1 has the second highest priority (below n only);
>    > > - Server n - 2 has the third highest priority (below n and n - 1);
>    > > ...
>    > > - Server 1 has the lowest priority (below all others).
>    > >
>    > > Summing over all servers, the total number of lower-priority servers
>    > > (which is what matters for avoiding remote checks) is as shown in the
>    > > attached picture. Graphically, it is like the area of a right triangle.
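
The 50% figure can be checked with a short calculation. It assumes that server i holds priority i and therefore checks the n - i servers above it, that without a priority chain every server would check all n - 1 others, and that every server inserts equally often (the last assumption is not stated in the thread).

```latex
\[
  \sum_{i=1}^{n} (n - i) \;=\; \frac{n(n-1)}{2},
  \qquad
  \frac{n(n-1)/2}{n(n-1)} \;=\; \frac{1}{2}.
\]
```

For n = 30 this gives 435 remote checks out of 870, which is the right-triangle area mentioned above.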
>    > >
>    > > One way to actually reduce the average remote access rate is to
>    > > establish priority policies that give the top priority positions to
>    > > the servers most likely to insert values into the GI.
>    > >
>    > >
>    > >
>    > >
>    > >
>    > > ----- Original Message -----
>    > > From: "Peter B. Volk" < peter.benjamin.volk@xxxxxxxxxxxxxxxxx>
>    > > To: <mysql-dde@xxxxxxxxxxxxx>
>    > > Sent: Wednesday, December 14, 2005 9:38 PM
>    > > Subject: [mysql-dde] Re: Fw: P/F/U Keys in LDD
>    > >
>    > >
>    > >> Hey,
>    > >>
>    > >> I've been thinking:
>    > >>
>    > >> so what about this one:
>    > >>
>    > >> S1 receives a query. There is an I/U on a P/U/F key index. S1
>    > >> evaluates the insert. If the key can be inserted (no key violation),
>    > >> then S1 starts to send the GI insert to the other servers
>    > >> synchronously. After the majority of the servers have agreed on the
>    > >> insert, the update is committed. Otherwise it is rolled back. The
>    > >> sending to the remote servers should be done on a rotation principle:
>    > >> if we have S1 to S8 and S4 receives the query, then it would send the
>    > >> update to S5, S6, S7, S8, S1, etc. This way we would avoid massive
>    > >> collisions with other GI updates. This also removes the master server
>    > >> you are worried about.
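
The rotation and majority rule in this proposal can be sketched as follows. This is only an illustration under stated assumptions: servers are numbered 1..n, the originating server counts its own local check as one vote, and "majority" means more than half of all servers. The names `rotation_order`, `try_commit` and the `accepts` callback are hypothetical.

```python
from typing import Callable


def rotation_order(origin: int, n_servers: int) -> list[int]:
    """Remote servers in the order `origin` contacts them: its successor
    first, wrapping around from Sn to S1."""
    return [(origin - 1 + step) % n_servers + 1 for step in range(1, n_servers)]


def try_commit(origin: int, n_servers: int,
               accepts: Callable[[int], bool]) -> bool:
    """Send the GI insert in rotation order; return True (commit) as soon
    as a majority has agreed, otherwise False (roll back)."""
    needed = n_servers // 2 + 1   # assumption: strict majority of all servers
    votes = 1                     # assumption: origin's local check counts
    for server in rotation_order(origin, n_servers):
        if accepts(server):       # stands in for the synchronous remote call
            votes += 1
        if votes >= needed:
            return True
    return False


if __name__ == "__main__":
    print(rotation_order(4, 8))   # S4 contacts S5, S6, S7, S8, S1, S2, S3
```

Because each origin starts at a different point in the ring, concurrent inserts from different servers reach the remote servers in different orders, which is the collision-avoidance effect the proposal is after.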
>    > >>
>    > >> What do you think of that?
>    > >>
>    > >> Peter
>    > >>
>    > >>
>    > >> ----- Original Message -----
>    > >> From: "Fabricio Mota" < fabricio.mota@xxxxxxxxx>
>    > >> To: <mysql-dde@xxxxxxxxxxxxx>
>    > >> Sent: Tuesday, December 13, 2005 1:16 AM
>    > >> Subject: [mysql-dde] Re: Fw: P/F/U Keys in LDD
>    > >>
>    > >>
>    > >>> Sorry, I forgot to answer one:
>    > >>> 2005/12/12, Fabricio Mota <fabricio.mota@xxxxxxxxx>:
>    > >>> >
>    > >>> > Ich bin zurück,
>    > >>> >
>    > >>> > (hahahahaha)
>    > >>> >
>    > >>> >
>    > >>> > 2005/12/12, Peter B. Volk < PeterB.Volk@xxxxxxx>:
>    > >>> > >
>    > >>> > > ----- Original Message -----
>    > >>> > > From: Fabricio Mota
>    > >>> > > To: mysql-dde@xxxxxxxxxxxxx
>    > >>> > > Sent: Friday, December 09, 2005 2:57 AM
>    > >>> > > Subject: [mysql-dde] Re: Fw: P/F/U Keys in LDD
>    > >>> > >
>    > >>> > >
>    > >>> > > Buenas noches (that's not my language! hahaha)
>    > >>> > >
>    > >>> > >      Oh yes, I didn't think about the lake. And the late-sync
>    > >>> > > protocol does not claim to be water-resistant :). You are right.
>    > >>> > > Maybe it's better to make the unplugged server roll back, and
>    > >>> > > have the cluster buffer the logs of the entire transaction for
>    > >>> > > its future recovery.
>    > >>> > >      Another thing I was thinking about today is to ensure
>    > >>> > > agreement on the buffer, that is, to replicate the buffer log to
>    > >>> > > all servers in late synchronization. That is to prevent another
>    > >>> > > fault from damaging global consistency. What do you think?
>    > >>> > >
>    > >>> > >      [Peter] Well, we'll need some sort of global log anyway. Not
>    > >>> > > only for this undo and redo stuff but also for replication (see
>    > >>> > > next email).
>    > >>> > > Yes, of course. I think global logs must be another starred point
>    > >>> > > to be submitted to our analysis (wow, how many things!!!)
>    > >>> > >
>    > >>> > >      By the way, in general, do you agree with late
>    > >>> > > synchronization?
>    > >>> > >
>    > >>> > >      [Peter] I would agree under the following conditions (feel
>    > >>> > > free to disagree):
>    > >>> > >
>    > >>> > >      1.) RDD tables can always be late synced (since they are only
>    > >>> > > management tables and 90% of queries against these tables are
>    > >>> > > selects)
>    > >>> > > Yes, I fully agree. Late sync must be applied in some situations
>    > >>> > > to help, and not to degrade.
>    > >>> > >
>    > >>> > >
>    > >>> > >      2.) There is an upper bound for late sync. This means that
>    > >>> > > there is some time (e.g. 10 sec.) for which the sync commands can
>    > >>> > > be delayed, but after this time the sync must be done
>    > >>> > > I can't understand it clearly.
>    > >>> > >
>    > >>> > > [Peter] I mean that a late sync should not be toooooo late, so the
>    > >>> > > sync should be done with an upper bound in time. The queries are
>    > >>> > > not supposed to be in the late sync queue for longer than 10 sec.
>    > >>> > > or so. This is simply a liveness property
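
The upper bound on late sync can be pictured as a queue that hands back any command once it has waited too long. The sketch below is an illustration only: the class name `LateSyncQueue`, the 10-second default and the explicit `now` argument (used instead of a real clock so the behaviour is easy to follow) are assumptions, not part of the discussed design.

```python
from collections import deque


class LateSyncQueue:
    """Holds delayed sync commands and reports those past the time bound."""

    def __init__(self, max_delay: float = 10.0):
        self.max_delay = max_delay   # upper bound in seconds (assumed value)
        self._pending = deque()      # (enqueue_time, command), oldest first

    def enqueue(self, command: str, now: float) -> None:
        """Record a sync command that is allowed to be sent late."""
        self._pending.append((now, command))

    def overdue(self, now: float) -> list[str]:
        """Remove and return every command that has waited max_delay or
        longer; these must be synced immediately."""
        due = []
        while self._pending and now - self._pending[0][0] >= self.max_delay:
            due.append(self._pending.popleft()[1])
        return due
```

A caller would poll `overdue` periodically and send whatever it returns, so no command stays queued much longer than the bound.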
>    > >>> >
>    > >>> >
>    > >>> Yes, this makes sense. But the problem is: once late sync has been
>    > >>> authorized, the command buffered and the transaction committed, I
>    > >>> think it is essential that the remaining buffered commands be sent to
>    > >>> the recipient before it is reintegrated into the group. Otherwise,
>    > >>> the server might become inconsistent.
>    > >>>
>    > >>> The exception would be if the server fell in the lake :). But if that
>    > >>> happens, and there is no means to recover the server state, then the
>    > >>> DBA must pull the server out of the cluster, using the *alter cluster
>    > >>> drop <dde_server>* command.
>    > >>>
>    > >>> So, if this command is performed successfully, ALL pending late sync
>    > >>> commands in the buffer must be purged.
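
The two rules above (flush the buffer before a server rejoins, purge it when the server is dropped) can be sketched with a per-server buffer. Everything here is hypothetical: the class `LateSyncBuffer`, its method names and the `send` callback are illustration only, and `drop_server` merely models the effect of the *alter cluster drop <dde_server>* command on the buffer.

```python
class LateSyncBuffer:
    """Per-server buffer of late-sync commands awaiting delivery."""

    def __init__(self):
        self._pending = {}   # server name -> list of buffered commands

    def buffer(self, server: str, command: str) -> None:
        """Queue a command for a server that is currently unreachable."""
        self._pending.setdefault(server, []).append(command)

    def reintegrate(self, server: str, send) -> int:
        """Deliver all buffered commands, in order, before the server
        rejoins the group. Returns the number of commands delivered."""
        commands = self._pending.get(server, [])
        delivered = 0
        while commands:
            send(server, commands[0])   # if this raises, the command stays queued
            commands.pop(0)
            delivered += 1
        self._pending.pop(server, None)
        return delivered

    def drop_server(self, server: str) -> int:
        """Purge everything pending for a server removed from the cluster.
        Returns the number of commands discarded."""
        return len(self._pending.pop(server, []))
```

Removing a command only after `send` returns keeps undelivered commands in the buffer if delivery fails partway through.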
>    > >>>
>    > >>> Hey, I will star that too; I think I did not explain it in the spec!!!!
>    > >>>
>    > >>>
>    > >>>
>    > >>> > >      3.) LDD tables are only partially late synced. The only
>    > >>> > > syncing to do is the GI update, right?
>    > >>> > > I think so.
>    > >>> > >
>    > >>> > >      So a late sync could only be done if there was an agreement
>    > >>> > > on the u/f/p keys.
>    > >>> > > I agree. Maybe we will need a means to ensure agreement within
>    > >>> > > dynamic operations. Maybe a flag per table, with an agreement
>    > >>> > > protocol, I don't know yet. But it must be a trusted means to
>    > >>> > > ensure a safe late sync.
>    > >>> > >
>    > >>> > > [Peter] We'll need to star this point until we have decided how to
>    > >>> > > do the u/f/p key validation
>    > >>> >
>    > >>> >
>    > >>> > Ok. Point starred. Command executed in 0.005s. :)
>    > >>> >
>    > >>> > >      AND inserts and updates are immediately synced and deletes
>    > >>> > > are late synced.
>    > >>> > > In truth, I think late sync for I/U/D might be allowed only if the
>    > >>> > > faulty server and the targeted server are different. That's
>    > >>> > > important to avoid prohibiting full access to the island server
>    > >>> > > during a network failure, for example, and at the same time to
>    > >>> > > ensure consistency.
>    > >>> > >
>    > >>> > >      This is because if there was a deletion the remote server
>    > >>> > > would still query the server, but the server would only return an
>    > >>> > > empty set. Inserts must be synced in time because there is a kind
>    > >>> > > of filter that the GI can set to optimize the number of remote
>    > >>> > > servers queried. Same applies for updates.
>    > >>> > >
>    > >>> > > This I also did not understand clearly.
>    > >>> > >
>    > >>> > > [Peter] Imagine 2 servers. A query is set off on S1. S1 needs data
>    > >>> > > from S2, so S1 needs to query S2 to retrieve the data it needs. In
>    > >>> > > the meantime S2 has executed a query that deleted exactly those
>    > >>> > > rows that S1 needs. No data was affected on S1. Now S1 queries S2.
>    > >>> > > Without late sync for deletes the transaction would need to be
>    > >>> > > delayed because the GI on S1 is not up to date and S1 can only
>    > >>> > > query S1 if the lock on the GI is released. With late sync S1 can
>    > >>> > > query S2 without any problems because the GI modification only has
>    > >>> > > optimization effects.
>    > >>> >
>    > >>> >
>    > >>> > Well, I think we agree on one point: however much late sync is
>    > >>> > used, it must always ensure consistency. What you described is a
>    > >>> > *time window* of inconsistency.
>    > >>> >
>    > >>> > What I would propose for that is: when any operation is performed
>    > >>> > while a server is down or unreachable - late-sync targeted or not -
>    > >>> > no operation targeted to it must be allowed.
>    > >>> >
>    > >>> > The fundamental idea of late sync - in my concept, of course - is
>    > >>> > that it must be a passive and *rightless* element of the
>    > >>> > transaction, unable to decide whether it can be performed or not.
>    > >>> > If it is an active element (such as: delete * from LDDTable where
>    > >>> > Server = 1 ----- note that here, for example, Server 1 is an active
>    > >>> > element), then late sync is not suitable. This violates *lemma 4*
>    > >>> > of late sync.
>    > >>> >
>    > >>> > Imagine: what if, in the delete query, there is a foreign key
>    > >>> > related to any of these records? If we allow deleting it (with the
>    > >>> > server down), it could be a disaster!
>    > >>> >
>    > >>> >
>    > >>> > --
>    > >>> > >
>    > >>> > > Sem mais,
>    > >>> > >
>    > >>> > > Fabricio Mota
>    > >>> > > Oda Mae Brown - Aprecie sem moderação.
>    > >>> > > http://www.odamaebrown.com.br
>    > >>> > > MySql-DDE discussion list
>    > >>> > > www.freelists.org/
>    > >>> > >
>    > >>> > >

MySql-DDE discussion list
www.freelists.org/
