[aodvv2-discuss] Re: Outstanding JWDs from April 4th (Lotte's email and attachment "2")

From: Lotte Steenbrink <lotte.steenbrink@xxxxxxxxxxxx>
To: aodvv2-discuss@xxxxxxxxxxxxx
Date: Tue, 19 Apr 2016 23:42:38 +0200

Correction: not 9, but 8 TODOs left. :) Now for the other emails...

On 19.04.2016 at 23:34, Lotte Steenbrink
<lotte.steenbrink@xxxxxxxxxxxx> wrote:

Hi Vicky, hi everyone,

On 15.04.2016 at 21:44, Victoria Mercieca <vmercieca0@xxxxxxxxx> wrote:

Hi all,

I finally found some time to go through everything marked with a JWD TODO
from Lotte's email on 4th April. I know there have already been some
comments, so I apologise if I am re-iterating things that have already been
updated!
I've listed all the TODOs I found (although there are a couple I haven't
commented on).

Here goes.....

Valid route
      A route that can be used for forwarding. (the deleted part should be
defined in Route Error Generation section JWD TODO: why there? If you think
it's a bit too verbose at that place, Local Route Set {#rte} has an
in-depth definition of what a valid route is and isn't)

I agree, the extra explanation doesn't need to go here. The deleted bit was:
which has been confirmed as having a bidirectional link to the next hop,
and has not timed out or been made invalid by a route error.

"Local Route Set" is the right place for this information, and the
explanation already there is fine. I vote to remove this bit from the
Terminology, and not insert anywhere else. If the reader wants extra info,
maybe include a reference to "Local Route Set" section with the terminology
entries for Valid route and Invalid route, e.g. "Further information can be
found in Section __".

Yup, done :)

---

Unreachable Address
      An address reported in a Route Error message. (the deleted part should
be defined in Route Error Generation section JWD TODO)

The deleted part was:
either the address on a LocalRoute which became Invalid, or the destination
address of an IP packet that could not be forwarded because a valid
LocalRoute to the destination is not known, and will not be requested.

I think this got added based on some previous comments, but I agree it is
unnecessary in Terminology. I think the RERR Generation section is already
pretty clear on what to use for "unreachable address" and all the
circumstances involved. I vote to remove the bit Justin suggested, and not
insert it anywhere else.

Yup. However, while I couldn’t really find a suitable place for it in 7.4.1.
RERR Generation, it wasn’t defined elsewhere either (apart from the
terminology section, of course). Maybe that could be done in 4.5. Local
Route Set?

---

3.  Applicability Statement
(Wireless properties of the network are not mentioned at all and this is
quite important as I’m pretty sure using AODVv2 in a wired network would
be a bad idea JWD TODO)

How about changing the first sentence from:
The AODVv2 routing protocol is a reactive routing protocol.
to:
The AODVv2 routing protocol is a reactive routing protocol intended for use
in mobile ad hoc wireless networks.

Fine by me. Changed. :)

---

AODVv2 is designed for stub or disconnected mobile ad hoc networks, i.e.,
non-transit networks or those not connected to the internet. AODVv2 can,
however, be configured to perform gateway functions when attached to
external networks, as discussed in Section 9. (some mention of limitations
of the provided approach is required.  No default gateways,
drawbacks/overhead of using AODVv2 network to connect to the Internet using
provided approach JWD TODO)

---

AODVv2 provides for message integrity and security against replay attacks by
using integrity check values, timestamps and sequence numbers, as described
in Section 13.  If security associations can be established, encryption can
be used for AODVv2 messages to ensure that only trusted routers participate
in routing operations.
(There is an issue with trust being limited to single hops and no way to
verify information more than one hop away JWD TODO)

Ongoing discussion.

---

AODVv2 supports routers with multiple interfaces and multiple IP addresses
per interface. (this should be disallowed as it can form non-valid routes
with uni-directional links JWD TODO)

I think that it can form non-valid routes with unidirectional links, *if*
multiple interfaces use the same IP address (and I think we included an
interface identifier and some rules about which interface a message should
be sent on, to get around this issue). I think our statement is now valid.

Okay.

---

4.2.  Router Client Table
(What’s the difference between a table and a list, and why are we using one
over the other? JWD TODO)

Are we renaming as much as possible to be Sets? This could easily be Router
Client Set. Also we have InterfaceSet, we could have Neighbor Set, we have
Local Route Set, can change to Multicast Route Message Set and RERR Set.

Yeah, I’ve got a "revisit set vs table vs list wording" item in my Trello
board, but I’ve never gotten round to doing that. :/ I’ve made the changes
you’ve suggested; if you have more suggestions I’ll be happy to incorporate
those as well. (I hope we won’t get flak for the confusing wording now? I’m
guessing tables are the most common way to represent this kind of info...)

---

5.  Metrics: (consider adding a maximum single hop value JWD TODO)

What was the consensus on this, before the discussion erupted into a number
of metric issues? It would exclude poor links from route discovery,
potentially with route discovery failing even though a route did exist. I
vote not to have a maximum single hop value, i.e. no maximum link cost.

+1. IIRC, Charlie kindly proposed some text for that, but he also said
that he doesn’t really want a maximum value...

---

6.3.  Neighbor Table Update
If an RREP_Ack is not received within the expected time (what is the
expected time? List the entry we are checking, assuming it is in the Multicast
Route Message Table? Outline it. JWD TODO: The MRMT is not the right place,
but I'm not sure if we're describing timers for this anywhere? Should we?)

Version 14 now says: "within RREP_Ack_SENT_TIMEOUT" but doesn't refer to how
we know this time is up. The RREP we are expecting to be acked would be in
the MRMT, wouldn't it, since it would have been multicast?

Right, it would, since RREPs with an AckReq are multicast! You’re brilliant,
Vicky, I kept confusing myself about this.

So we would be testing "If an RREP_Ack is not received in response to a
multicast RREP, within RREP_Ack_SENT_TIMEOUT of the time the RREP was sent,"
and if we need to be more explicit "within RREP_Ack_SENT_TIMEOUT of the
Timestamp of the Multicast Route Message Entry corresponding to the
RREP,"... but that is hideously wordy.

I’ve reworded it to:

If the Multicast Route Message Set contains an entry where:

* RteMsg.MessageType == RREP
* RteMsg.AckReqAddr == Neighbor.IPAddress
* RteMsg.Timestamp + RREP_Ack_SENT_TIMEOUT < CurrentTime

the link is considered to be uni-directional and the Neighbor Set entry is
updated as follows:

* Neighbor.State := Blacklisted
* Neighbor.ResetTime := CurrentTime + MAX_BLACKLIST_TIME

what do you think?
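
To make the reworded check concrete, here is a minimal Python sketch of it. It is only an illustration, not draft text: the field names mirror RteMsg.* and Neighbor.* above, but the two timer values and the "Heard" default state are placeholders I made up for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholder values for illustration only (assumptions, not draft defaults).
RREP_ACK_SENT_TIMEOUT = 1.0   # seconds
MAX_BLACKLIST_TIME = 200.0    # seconds


@dataclass
class RteMsg:
    """One entry of the Multicast Route Message Set."""
    message_type: str              # "RREQ" or "RREP"
    ack_req_addr: Optional[str]    # neighbor asked to send an RREP_Ack, if any
    timestamp: float               # time the message was sent


@dataclass
class Neighbor:
    """One entry of the Neighbor Set."""
    ip_address: str
    state: str = "Heard"           # assumed name for a non-blacklisted state
    reset_time: Optional[float] = None


def blacklist_if_ack_overdue(neighbor, route_message_set, current_time):
    """Blacklist the neighbor if an RREP that requested its ack has timed out."""
    for msg in route_message_set:
        if (msg.message_type == "RREP"
                and msg.ack_req_addr == neighbor.ip_address
                and msg.timestamp + RREP_ACK_SENT_TIMEOUT < current_time):
            # The link is considered uni-directional.
            neighbor.state = "Blacklisted"
            neighbor.reset_time = current_time + MAX_BLACKLIST_TIME
            return True
    return False
```

Note that nothing here needs a separate timer list: the check only compares the stored Timestamp with the current time whenever the Neighbor Set is consulted.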

(also, note the todo: why does the MRMS not have an interface field? Does
that make sense or did I just forget to add that when I introduced
interfaces?)

Also I guess we need to state that there should be a timer running, so that
when it expires we can set the neighbor as blacklisted. Do timers need their
own list? That's too implementation specific, surely?

Well, there is (is there some table that lists routes that are being waited
on? JWD), but I think we can avoid creating that list with the above text?
Just keep an eye on the MRMS...

---

When a link to a neighbor is determined to be broken, the NeighborTable
entry SHOULD be removed. (What does this mean? How would one determine a
link to a neighbor to be broken... suggest removing it JWD TODO)

I guess this is an external signal. Do we remove this statement, or expand
it further? e.g. "If a link to a neighbor is determined to be broken
according to a signal from an external process..." I think it's useful to
indicate when an entry would be deleted.

+1. In the "New Draft" thread, I wrote:

"
He’s asking the following to be removed:

When a link to a neighbor is determined to be broken, the Neighbor
   Table entry SHOULD be removed.

We’ve had another issue with the word "broken" in 6.10.1. LocalRoute State
Changes, where we’ve changed the wording to "If an external mechanism reports
a link as broken, ..." I think we should do the same here (instead of just
removing the sentence).
"

… I did that now. :)

Otherwise, do we keep the entry forever, in case the link ever becomes
bidirectional?

Kind of? We blacklist it and then we try again after some time, right?

---

Lowest priority SHOULD be given to RERR messages generated in response to
RREP messages which cannot be regenerated.  In this case the route request
will be retried at a later point.

(It’s not clear to me how this priority works. Do the messages just get
dropped when some threshold for max number of messages is reached? Is
there some queuing method? Because if there isn’t, I don’t see how priority
would work at all, as the choices are drop or send. This whole section
seems out of place as we haven’t talked about generating messages. I’d
suggest moving or removing JWD DONE partially ([master e9927a8] redo
priority definition))

Did we decide that when we hit a certain limit, we then remove queued
messages of lower priorities, so that we can instead queue messages of
higher priorities? And if the queue is full of higher priority messages,
drop (or avoid creation of) any new messages?

e.g. some suggested text ready for comments/criticisms.
"To implement the congestion control, a queue length is set. If the queue is
full, in order to queue a new message, a message of lower priority must be
removed from the queue. If this is not possible, the new message MUST be
discarded. The queue should be sorted in order of message priority."

Or is there a less "this is how you MUST implement it" sort of statement we
could use instead, like "it's up to an implementer to decide if and how to do
this"?

IIRC, this was discussed in the "Justin's review" thread, and it currently
ends with a question from Stan: "when we reach the condition of having a full
queue of packets to send, do we tail-drop (drop new messages)? Or do we
remove the oldest message from the head of the queue? I'm thinking tail-drop
is the right thing to do, but I want folks to at least give it a thought."

Since you and Stan seem to agree, I’ve used your text :) The only thing I’m
having issues with is the "a queue length is set" thing: we don’t want to
define a length, do we? So why do we need this part of the sentence? Or do
you mean "the current queue length is recorded and shouldn’t increase until
this has been sorted out"?
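
For what it's worth, this is roughly how I read the suggested text, as a small Python sketch. It is one possible reading and not a proposal for the draft: the class name, the numeric priorities (lower number = higher priority) and the configurable max_length are all assumptions made for the example.

```python
import bisect


class ControlMessageQueue:
    """Bounded queue of control messages, kept sorted by priority.

    Lower numbers mean higher priority (an assumption of this sketch).
    """

    def __init__(self, max_length):
        self.max_length = max_length
        self._items = []   # sorted list of (priority, sequence, message)
        self._seq = 0      # tie-breaker so equal priorities stay in FIFO order

    def enqueue(self, priority, message):
        """Return True if the message was queued, False if it was discarded."""
        if len(self._items) >= self.max_length:
            lowest_priority_entry = self._items[-1]
            if lowest_priority_entry[0] <= priority:
                # Nothing of lower priority to evict: discard the new message.
                return False
            # Evict the lowest-priority (and, among those, newest) message.
            self._items.pop()
        bisect.insort(self._items, (priority, self._seq, message))
        self._seq += 1
        return True

    def dequeue(self):
        """Return the highest-priority message, or None if the queue is empty."""
        if not self._items:
            return None
        return self._items.pop(0)[2]
```

With a full queue, a new message of equal priority is dropped, which matches Stan's tail-drop suggestion. Whether the length is fixed by the spec or left to the implementer is exactly the open question above.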

---

RREQ_Gen awaits reception of a Route Reply message (RREP) containing a route
toward TargAddr. (is there some table that lists routes that are being
waited on? JWD TODO)

Is it enough that the RREQ message will still be in the MRMT?

Is it better to slightly re-write, maybe like this: "RREQ_Gen awaits
reception of a Route Reply message (RREP) corresponding to the RREQ. This
RREP contains a route toward TargAddr, which is installed at RREQ_Gen,
completing the route discovery procedure."

Similar to above...do we need to state explicitly that there should be a
timer running, so that when it expires we can retry the route discovery? Do
timers need their own list? That's too implementation specific, surely?

See above, I think we can merge those discussions? :)

---

7.1.3.  RREQ Regeneration
(There might be a problem with very short validity times on very low metric
routes in that they will be selected regardless of the time. Not a problem
but something that could cause issues. Perhaps some minimum time should be
mandated. JWD TODO)

I'm happy either way. I think Charlie mentioned it in a recent email too. If
we mandate a minimum time, what should it be? This then needs to be
mentioned throughout the draft in multiple generation/regeneration sections,
stating that the message should not be regenerated if the validity time is
less than the minimum.

+1, I don’t mind setting another constant but I won’t pull one out of my
nose. But if someone wants to propose some text, I’ll be happy to add it.

---

7.1.1.  RREQ Generation
and all generation/regeneration sections

   If the limit for the rate of AODVv2 control message generation has
   been reached, no message SHOULD be generated.  If approaching the
   limit, the message should be sent if the priorities in Section 6.5
   allow it.
(There is no defined behavior here. Approaching the limit? Section 6.5
outlines which messages are more important but not how to decide to allow
them to be transmitted or not. JWD TODO)

Fixed as above?

Yes, I’ve deleted all "If approaching the limit" sentences. That means we’re
down to 11 TODOs now! Hooray! :)

---

7.2.3.  RREP Regeneration
The router MAY choose not to regenerate the RREP, in the same way it MAY
choose not to regenerate an RREQ (see Section 7.1.3), though this could
decrease connectivity in the network or result in non-optimal paths.
(This seems like a bad idea to do on the reverse path. It would be a good
way to attack a network: forward RREQs and drop all RREPs. JWD DONE?)

If there is a router which doesn't forward a RREP, but does send back an Ack
toward TargAddr, then the next hop router would not blacklist this router.
And the RREP doesn't reach RREQ_Gen, so a new RREQ will go out and the same
will happen again. RREQ_Gen will eventually give up.

If we state that a router which doesn't forward a message must not send
acknowledgements, and that if you forward a RREQ you MUST forward a
corresponding RREP, that fixes it, right? Except if you have reached
your congestion limit in between these two events? But then something crazy
is going on anyway. Hopefully on the retry the congestion has reduced? Or,
on the retry the RREQ probably wouldn't be forwarded by this router, and
would therefore discover another path.

I think I agree with Justin. If we receive a RREP it pretty much means we
sent the RREQ, so we should do our best to regenerate the RREP. Let's not
allow "The router MAY choose not to regenerate the RREP"….

Yup, the draft currently says:

The RREP SHOULD NOT be regenerated if CONTROL_TRAFFIC_LIMIT
has been reached. Otherwise, the router MUST regenerate the RREP.

I’ve still got a little Trello card that lists "Security Considerations:
mention forward-RREQ&drop-RREP" as a TODO. If anyone has time to write me
some text to copy & paste I’ll happily do that, otherwise I’ll have to see if
I can come up with something before the deadline.

---

7.2.3.  RREP Regeneration
   4.  If the link to the next hop router toward OrigAddr is not known
       to be bidirectional, include the AckReq with the address of the
       intended next hop router
(There needs to be some sort of table which is updated with a timer
associated with this AckReq JWD TODO)

In this case the RREP was multicast, so should be in the MRMT, along with a
timestamp.

Same as above...do we need to state explicitly that there should be a timer
running, so that when it expires we can set the neighbor as blacklisted? Do
timers need their own list? That's too implementation specific, surely?

Yeah, see above. :)

---

7.3.2.  RREP_Ack Reception
   Upon receiving an RREP_Ack, an AODVv2 router performs the following steps:
   1.  Update the Neighbor Table according to Section 6.3
       *  If the sender has Neighbor.State set to Blacklisted after the
update, ignore this RREQ for further processing.
(Shouldn’t this check come before updating?  The only way the Neighbor.State
is set to Blacklisted is that its already Blacklisted and there is no need
to update the Neighbor Table. JWD TODO)

I think 6.3 says whether to update the state from blacklisted (if the reset
time has been reached). Because otherwise we didn't specifically mandate a
timer which updates that entry to stop the neighbour being blacklisted. The
fact that the neighbor *remains* blacklisted after checking the reset time,
that is what's important in step 1.

Yeah, but the wording is a bit misleading, isn’t it? I’ve rephrased it to say

1.  Check and update the Neighbor Set according to [](#nbrupdate)
    *  If the sender has Neighbor.State set to Blacklisted, ignore this RREQ
for further processing.
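
In code, the intent of that step would be something like the Python sketch below. It is only an illustration under assumptions: the draft text quoted here does not say which state a neighbor gets once its ResetTime has passed, so the "Heard" value is a placeholder, and the function name is made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Neighbor:
    """Minimal Neighbor Set entry for this sketch."""
    state: str
    reset_time: Optional[float] = None


def check_neighbor_and_allow(neighbor, current_time):
    """Check and update the entry first, then decide whether to keep processing."""
    if (neighbor.state == "Blacklisted"
            and neighbor.reset_time is not None
            and current_time >= neighbor.reset_time):
        # Blacklist period is over, so the neighbor gets another chance.
        neighbor.state = "Heard"   # assumed post-blacklist state
        neighbor.reset_time = None
    # Only a neighbor that is still blacklisted after the check is ignored.
    return neighbor.state != "Blacklisted"
```

The point is the ordering: the reset time is evaluated before the Blacklisted test, so no background timer is needed to un-blacklist a neighbor.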

---

8.1.2.  Message TLV Block
and all other message TLV block sections
   An RREQ contains no Message TLVs.
(this isn’t mandatory, right? Just AODVv2 doesn’t define any message TLVs
for use with RREQ JWD TODO?)

Did we suggest updating to something like "This draft specifies/requires no
Message TLVs for this message type"?

Yeah, Charlie proposed "AODVv2 does not define any Message TLVs for an RREQ
message." and that’s what it (and the other, similar sections) says now. :)

---

8.3.3.  Address Block
   An RREP_Ack contains no Address Block.
(Mandatory or just not defined? Guessing mandatory on this one. JWD TODO)

Any address block would be ignored by the current draft. Again something
like "This draft requires no Address Block for this message type"?

Same for the Address Block TLV Block for the RREP_Ack.

Yup. :)

---

  When a LocalRoute is expunged, any precursor list associated with it MUST
also be expunged.
(I feel like this is underspecified and would be better served in a separate
document like Intermediate RREP.  Is there anything wrong with leaving the
behavior required a MAY and describing that behavior in another draft? JWD
TODO)

Are we ok to remove extensions to future separate drafts?

I can’t remember? I think at some point we agreed to remove all but one? Ugh,
mailing lists are the worst issue trackers :(

To enable precursor lists, maybe we need something in RERR
generation/regeneration sections to state that a RERR "SHOULD" be multicast,
rather than MUST (for the RERR without PktSource). This allows other ways of
sending the RERR (such as unicast to specific precursors), according to RFC
2119's meaning of SHOULD, I think?

Makes sense. Done.

To enable expanding rings multicast we either need to reintroduce
msg-hop-limit and msg-hop-count, or define a new hop_count TLV.

*sigh* the rfc5444 expert jury is still out on that whole thing, right?

Then I think where we have statements like "MAY decide not to regenerate the
RREQ", this covers the use of expanding rings multicast as an extension.

Well, we don’t explicitly say that the RREQ MUST be (re)generated, so we’ve
already got the loophole we need, don’t we?

For iRREP, we already have an extension draft. I'm hazy on the details, is
there anything we need to state in our draft to allow this extension?

For message aggregation delay, that's a 5444 option. That's probably
something we would configure in whatever multiplexer we might use. Not sure
it's worth writing an extension for AODVv2 for that.

Yeah, that section doesn’t have much content, does it? Everything it
describes is 5444 territory anyway. I'll just delete that. OK?

---

(is MAX_SEQNUM_LIFETIME the only parameter which MUST be configured the same
across the network?  JWD TODO: yes?)

Other timers are active interval, max idletime, max blacklist time, rtemsg
entry time, rreq wait time, rrep ack sent timeout and rreq holddown time.

I'd say ACTIVE_INTERVAL and MAX_IDLETIME should be configured the same
across the network too...just because if they affect one hop in an
established route, they affect all hops.

Charlie wrote the bit about what happens if constants are different on
different routers, I think this got pasted from there (11.2). Should we do
the same for timers? Seems like common sense really:

- Routers with lower values for ACTIVE_INTERVAL + MAX_IDLETIME will
invalidate routes more quickly and free resources used to maintain them.
This can affect bursty traffic flows which have quiet periods longer than
ACTIVE_INTERVAL + MAX_IDLETIME. A route which has timed out due to perceived
inactivity is not reported. When the bursty traffic resumes, it would cause
a RERR to be generated, and the traffic itself would be dropped. This route
would be removed from all upstream routers, even if those upstream routers
had larger ACTIVE_INTERVAL or MAX_IDLETIME values. A new route discovery
would be required to re-establish the route, causing extra routing protocol
traffic and disturbance to the bursty traffic.
- Routers with lower values for MAX_BLACKLIST_TIME would allow neighboring
routers to participate in route discovery sooner than routers with higher
values. This could result in failed route discoveries if un-blacklisted
links are still uni-directional. Since RREQs are retried, this would not
affect success of route discovery unless this value was so small as to
un-blacklist the router before the RREQ is retried. This value need not be
consistent across the network since it is used for maintaining a 1-hop
blacklist. However it MUST be greater than RREQ_WAIT_TIME.
[and probably a good idea to be at least a multiple of RREQ_WAIT_TIME?]
- Routers with lower values for RERR_TIMEOUT may create more RERR messages
than routers with higher values. This value should be large enough that a
RERR will reach all routers using the route reported within it before the
timer expires, so that no further data traffic will arrive, and no
duplicated RERR messages will be generated.
- Routers with lower values for RteMsg_ENTRY_TIME may not consider received
redundant multicast route messages as redundant, and may regenerate these
messages unnecessarily.
- Routers with lower values for RREQ_WAIT_TIME may send more frequent RREQ
messages and wrongly determine that a route does not exist, if the delay in
receiving an RREP is greater than this interval.
- Routers with lower values for RREP_Ack_SENT_TIMEOUT may wrongly determine
links to neighbors to be unidirectional if an RREP_Ack is delayed longer
than this timeout.
- Routers with lower values for RREQ_HOLDDOWN_TIME will retry failed route
discoveries sooner than routers with higher values. This may be an advantage
if the network topology is frequently changing, or may unnecessarily cause
more routing protocol traffic.

I’m assuming this is your proposed text, right? I’ve added it to the draft
and appended

MAX_SEQNUM_LIFETIME, ACTIVE_INTERVAL and MAX_IDLETIME MUST be configured to
have the same values for all AODVv2 routers in the network.
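
As a sanity check on these rules, here is a small Python sketch of how a deployment could validate its timer settings. The constant names come from the text above. The function, the shape of the config dictionaries and the values in the usage example are assumptions for illustration, not anything the draft defines.

```python
# Timers that MUST have the same value on every router in the network.
MUST_MATCH = ("MAX_SEQNUM_LIFETIME", "ACTIVE_INTERVAL", "MAX_IDLETIME")


def check_timer_config(router_configs):
    """Return a list of problems found in {router_name: {timer_name: seconds}}."""
    problems = []
    for name in MUST_MATCH:
        values = {cfg[name] for cfg in router_configs.values()}
        if len(values) > 1:
            problems.append(f"{name} differs across routers: {sorted(values)}")
    for router, cfg in router_configs.items():
        # A per-router rule: this value need not be consistent network-wide.
        if cfg["MAX_BLACKLIST_TIME"] <= cfg["RREQ_WAIT_TIME"]:
            problems.append(
                f"{router}: MAX_BLACKLIST_TIME must be greater than RREQ_WAIT_TIME"
            )
    return problems


# Usage with made-up values (seconds):
configs = {
    "r1": {"MAX_SEQNUM_LIFETIME": 300, "ACTIVE_INTERVAL": 5, "MAX_IDLETIME": 200,
           "MAX_BLACKLIST_TIME": 200, "RREQ_WAIT_TIME": 2},
    "r2": {"MAX_SEQNUM_LIFETIME": 300, "ACTIVE_INTERVAL": 10, "MAX_IDLETIME": 200,
           "MAX_BLACKLIST_TIME": 1, "RREQ_WAIT_TIME": 2},
}
for problem in check_timer_config(configs):
    print(problem)
```

Here r2 trips both rules: its ACTIVE_INTERVAL differs from r1's, and its MAX_BLACKLIST_TIME is not greater than its RREQ_WAIT_TIME.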

---

o  RREP_RETRIES: Routers with lower values are more likely to
      blacklist neighbors when there is a

Um... where did the rest of this sentence go? Must be my fault! Draft 11
says this:

RREP_RETRIES: Nodes with lower values are more likely to blacklist
      neighbors when there is a temporary fluctuation in link quality.

Whoops, I think the last part of that sentence got moved down. Anyway, fixed
now!

…And with all the above changes made, we’re down to 9 TODOs! Great work
everyone :)

---

Now for the RFC5444 usage draft…!

I’m curious what you’ve got to say! Also, in case you folks agree with me
regarding the api/list of don’ts/whatever, I think I could use some back-up
there ;)

Best regards,
Lotte

Kind regards,
Vicky.

Follow-Ups:
- [aodvv2-discuss] 1/CONTROL_TRAFFIC_LIMIT
  - From: Charlie Perkins

References:
- [aodvv2-discuss] Outstanding JWDs from April 4th (Lotte's email and attachment "2")
  - From: Victoria Mercieca
- [aodvv2-discuss] Re: Outstanding JWDs from April 4th (Lotte's email and attachment "2")
  - From: Lotte Steenbrink
