From ZmnSCPxj at protonmail.com Wed Jun 2 09:30:59 2021 From: ZmnSCPxj at protonmail.com (ZmnSCPxj) Date: Tue, 01 Jun 2021 23:30:59 +0000 Subject: [c-lightning] Routing: bias against smaller channels In-Reply-To: References: <102B86C6-73EE-4CF1-9353-84D9E60203CF@nibnalin.me> <7df2f90a3a31edc5611d23ff27942d72@schmoock.net> <5f4691b90d85f515cb9f7403e109e880@schmoock.net> Message-ID: > Hi, > > > In code this looks like this: > > > > https://github.com/lightningd/plugins/pull/246/commits/2dbff662f712c7883169e7557f5170e356c942e8 > > Interesting! So if I understand this routine correctly, this finds a path with the least number of hops, and in the subset of paths of that length, a path with high msat (likely just the one with the?highest?msat in practice). > > I suppose it's not surprising that pathfinding's O(maxhop*(N log)) isn't the bottleneck for most practical usage, Yes. However it does somewhat assume that the published network will not grow in the future. If it **does** grow, then the n * n log n will be significant. > but to add to ZmnSCPxj's comment, if I'm not mistaken, I think we can redefine the edge weights in the graph and run the equivalent of your subroutine in a single pass of dijkstra. Correct, but that is the sticking point in the reluctance to modify `getroute`! For example, we have a `riskfactor` argument to `getroute`. This translates the CLTV-delta cost on each hop into an `msatoshi` cost that is then added to the fee to that hop, to generate the cost ("edge weight") of that hop. If we add another conversion factor from channel size (or log(channel size) --- and what would the base of the log be as well?) then that adds more parameters to pass to `getroute`. Personally I do not really have an intuitive grasp of the existing `riskfactor` argument to `getroute`, and have to look up the table in the docs all the time. I imagine that if we add another parameter to factor in the channel size, the same effect will apply (i.e. it is difficult to intuitively understand the parameter that converts channel size to the edge weight). That is why there is an idea to just expose the routemap directly, so that plugins can provide their own exact pathfinding algorithm. -- Another idea would be to pass in a "function" to the `getroute` method. Basically, `getroute` would be given two optional arguments, `costfunctionmethod` and `costfunctionargs`. The first would be a string, the second would be a generic JSON object. * A plugin that wants to provide its own edge weight computation to `getroute` would first implement its own unique cost function as an ordinary plugin command. * Then it would pass in the name of that command as the `costfunctionmethod` to `getroute`. * `getroute` would then invoke the given method with the channel details (i.e. similar to an object in `listchannels` output?), which should return a `{"cost": 42.0}` object. * `getroute` would then cache the cost for every channel it has to open, in order to reduce the RPC-call overhead. * Optionally, if the function requires additional arguments, the plugin could add `costfunctionargs`, and any object keys in that object will be added to the arguments to the given `costfunctionmethod`. Basically the above just uses the command name as a function address, and implements a closure by explicitly passing in additional arguments to the function. What is more, as I noted in my presentation in Berlin in 2019, Dijkstra, A\*, and Greedy Best First Search are really a continuum of related algorithms. The core of this family of algorithms is that they have **two** cost functions, `G()` and `H()`. They form the "cumulative" cost function `F() = G() + H()`. `G()` is the "actual distance from current node to next node". `H()` is the "estimated distance from next node to destination" i.e. the "heuristic". In Dijkstra at one end, `H()` always returns 0. At the other end, Greedy Best First Search, `G()` always returns 0. In between them, A\* has non-zero `H()` (biased to underestimate; if it overestimates, it starts to move towards Greedy Best First) and non-zero `G()`. We can instead have a `getroutealgo` command that accepts two functions, `G()` and `H()` using the above technique ("pass in the method name and any extra method args, i.e. closure"), which would also allow plugins to experiment with using A* and Greedy Best First by simply playing around with the cost functions. Then we could also have a `payalgo` command that accepts a `getroute`-like replacement function, using the closure technique, so that plugins can experiment with various routing functions without having to reimplement the complicated `pay` logic. -- The difficulty here is that `getroute` is really implemented over in `gossipd`, and the interfaces to the plugins are over in the main daemon `lightningd`. So we may very well need to expose the on-disk routemap format to `lightningd` at least so that `getroute` is instead implemented there, where it can poke at the plugin RPC interface. Thoughts, Rusty? Regards, ZmnSCPxj From michael at schmoock.net Thu Jun 3 01:46:19 2021 From: michael at schmoock.net (michael at schmoock.net) Date: Wed, 02 Jun 2021 17:46:19 +0200 Subject: [c-lightning] Routing: bias against smaller channels In-Reply-To: References: <102B86C6-73EE-4CF1-9353-84D9E60203CF@nibnalin.me> <7df2f90a3a31edc5611d23ff27942d72@schmoock.net> <5f4691b90d85f515cb9f7403e109e880@schmoock.net> Message-ID: <2937df42f0cb93ed2c8e698a1ddc5290@schmoock.net> Hi, I made some test by pre-filtering the routes on a maxhops/capacity basis which prefer "shorter and bigger pipes first" approach as mentioned earlier. Here are some results: # Disable both filters (like without this PR, just the erring node filter= #> lnc rebalance outgoing_scid=685247x957x0 incoming_scid=685132x1087x0 msatoshi=0.005btc Plugin rebalance initialized with 1000msat base / 10 ppm fee cltv_final:18 maxhops:0 msatfactor:1.0 maxhops:20 msatfactor:1.0 running_for:4 count_getroute:1 time_getroute:0.16330862045288086 time_getroute_avg:0.16330862045288086 count_sendpay:1 time_sendpay:4.338505029678345 time_sendpay_avg:4.338505029678345 maxhops:20 msatfactor:1.0 running_for:8 count_getroute:2 time_getroute:0.46318650245666504 time_getroute_avg:0.23159325122833252 count_sendpay:2 time_sendpay:7.869954586029053 time_sendpay_avg:3.9349772930145264 maxhops:20 msatfactor:1.0 running_for:12 count_getroute:3 time_getroute:0.6282370090484619 time_getroute_avg:0.2094123363494873 count_sendpay:3 time_sendpay:11.94294023513794 time_sendpay_avg:3.980980078379313 maxhops:20 msatfactor:1.0 running_for:19 count_getroute:4 time_getroute:0.845451831817627 time_getroute_avg:0.21136295795440674 count_sendpay:4 time_sendpay:18.372679471969604 time_sendpay_avg:4.593169867992401 maxhops:20 msatfactor:1.0 running_for:23 count_getroute:5 time_getroute:1.018324375152588 time_getroute_avg:0.20366487503051758 count_sendpay:5 time_sendpay:22.277318716049194 time_sendpay_avg:4.4554637432098385 maxhops:20 msatfactor:1.0 running_for:27 count_getroute:6 time_getroute:1.1925179958343506 time_getroute_avg:0.1987529993057251 count_sendpay:6 time_sendpay:25.69105839729309 time_sendpay_avg:4.281843066215515 maxhops:20 msatfactor:1.0 running_for:31 count_getroute:7 time_getroute:1.4166362285614014 time_getroute_avg:0.2023766040802002 count_sendpay:7 time_sendpay:29.830421209335327 time_sendpay_avg:4.261488744190761 maxhops:20 msatfactor:1.0 running_for:35 count_getroute:8 time_getroute:1.590040683746338 time_getroute_avg:0.19875508546829224 count_sendpay:8 time_sendpay:33.221359729766846 time_sendpay_avg:4.152669966220856 maxhops:20 msatfactor:1.0 running_for:38 count_getroute:9 time_getroute:1.763425588607788 time_getroute_avg:0.19593617651197645 count_sendpay:9 time_sendpay:36.21986770629883 time_sendpay_avg:4.024429745144314 maxhops:20 msatfactor:1.0 running_for:42 count_getroute:10 time_getroute:1.9394252300262451 time_getroute_avg:0.19394252300262452 count_sendpay:10 time_sendpay:39.911234855651855 time_sendpay_avg:3.9911234855651854 maxhops:20 msatfactor:1.0 running_for:44 count_getroute:11 time_getroute:2.115779161453247 time_getroute_avg:0.19234356013211337 count_sendpay:11 time_sendpay:42.25611186027527 time_sendpay_avg:3.841464714570479 maxhops:20 msatfactor:1.0 running_for:49 count_getroute:12 time_getroute:2.32531476020813 time_getroute_avg:0.19377623001734415 count_sendpay:12 time_sendpay:46.37626266479492 time_sendpay_avg:3.8646885553995767 maxhops:20 msatfactor:1.0 running_for:51 count_getroute:13 time_getroute:2.5044617652893066 time_getroute_avg:0.19265090502225435 count_sendpay:13 time_sendpay:48.622201442718506 time_sendpay_avg:3.740169341747577 maxhops:20 msatfactor:1.0 running_for:54 count_getroute:14 time_getroute:2.6831555366516113 time_getroute_avg:0.19165396690368652 count_sendpay:14 time_sendpay:51.87213206291199 time_sendpay_avg:3.705152290207999 maxhops:20 msatfactor:1.0 running_for:58 count_getroute:15 time_getroute:2.9157254695892334 time_getroute_avg:0.19438169797261556 count_sendpay:15 time_sendpay:55.00669240951538 time_sendpay_avg:3.6671128273010254 # Now with the currently suggested default values maxhops := 5 and msatfactor :=4 Plugin rebalance initialized with 1000msat base / 10 ppm fee cltv_final:18 maxhops:5 msatfactor:4.0 maxhops:3 msatfactor:4.0 running_for:3 count_getroute:9 time_getroute:0.06924104690551758 time_getroute_avg:0.00769344965616862 count_sendpay:1 time_sendpay:0.7183606624603271 time_sendpay_avg:0.7183606624603271 maxhops:3 msatfactor:4.0 running_for:4 count_getroute:10 time_getroute:0.1384265422821045 time_getroute_avg:0.01384265422821045 count_sendpay:2 time_sendpay:1.7709376811981201 time_sendpay_avg:0.8854688405990601 maxhops:3 msatfactor:4.0 running_for:5 count_getroute:11 time_getroute:0.20621562004089355 time_getroute_avg:0.018746874549172142 count_sendpay:3 time_sendpay:2.66377592086792 time_sendpay_avg:0.8879253069559733 We can see that: 1. CPU on excessive getroute usage isn't a big deal, even on a raspberry pi. 2. The route was found in 5 seconds compared to not found (in 60 seconds) 3. Calling sendpay is more expensive timewise by a factor. Exact value cannot be determined by the above data, as the successful route was just 5 (3+2) hops (compared to what getroute gives us without maybe 8 total hops). 4. We see that sendpay takes more time than getroute, this becomes more painful on longer routes (which no maxhops filter gives early on) Just wanted to tell you that for big chunk payments (w/o mpp) it seems to make sense to try "shorter and bigger pipes first". Michael On 2021-06-02 01:30, ZmnSCPxj wrote: >> Hi, >> >> > In code this looks like this: >> > >> > https://github.com/lightningd/plugins/pull/246/commits/2dbff662f712c7883169e7557f5170e356c942e8 >> >> Interesting! So if I understand this routine correctly, this finds a >> path with the least number of hops, and in the subset of paths of that >> length, a path with high msat (likely just the one with >> the?highest?msat in practice). >> >> I suppose it's not surprising that pathfinding's O(maxhop*(N log)) >> isn't the bottleneck for most practical usage, > > Yes. > > However it does somewhat assume that the published network will not > grow in the future. > If it **does** grow, then the n * n log n will be significant. > >> but to add to ZmnSCPxj's comment, if I'm not mistaken, I think we can >> redefine the edge weights in the graph and run the equivalent of your >> subroutine in a single pass of dijkstra. > > Correct, but that is the sticking point in the reluctance to modify > `getroute`! > > For example, we have a `riskfactor` argument to `getroute`. > This translates the CLTV-delta cost on each hop into an `msatoshi` > cost that is then added to the fee to that hop, to generate the cost > ("edge weight") of that hop. > > If we add another conversion factor from channel size (or log(channel > size) --- and what would the base of the log be as well?) then that > adds more parameters to pass to `getroute`. > > Personally I do not really have an intuitive grasp of the existing > `riskfactor` argument to `getroute`, and have to look up the table in > the docs all the time. > I imagine that if we add another parameter to factor in the channel > size, the same effect will apply (i.e. it is difficult to intuitively > understand the parameter that converts channel size to the edge > weight). > > That is why there is an idea to just expose the routemap directly, so > that plugins can provide their own exact pathfinding algorithm. > > -- > > Another idea would be to pass in a "function" to the `getroute` method. > > Basically, `getroute` would be given two optional arguments, > `costfunctionmethod` and `costfunctionargs`. > The first would be a string, the second would be a generic JSON object. > > * A plugin that wants to provide its own edge weight computation to > `getroute` would first implement its own unique cost function as an > ordinary plugin command. > * Then it would pass in the name of that command as the > `costfunctionmethod` to `getroute`. > * `getroute` would then invoke the given method with the channel > details (i.e. similar to an object in `listchannels` output?), which > should return a `{"cost": 42.0}` object. > * `getroute` would then cache the cost for every channel it has to > open, in order to reduce the RPC-call overhead. > * Optionally, if the function requires additional arguments, the > plugin could add `costfunctionargs`, and any object keys in that > object will be added to the arguments to the given > `costfunctionmethod`. > > Basically the above just uses the command name as a function address, > and implements a closure by explicitly passing in additional arguments > to the function. > > What is more, as I noted in my presentation in Berlin in 2019, > Dijkstra, A\*, and Greedy Best First Search are really a continuum of > related algorithms. > The core of this family of algorithms is that they have **two** cost > functions, `G()` and `H()`. > They form the "cumulative" cost function `F() = G() + H()`. > > `G()` is the "actual distance from current node to next node". > `H()` is the "estimated distance from next node to destination" i.e. > the "heuristic". > > In Dijkstra at one end, `H()` always returns 0. > At the other end, Greedy Best First Search, `G()` always returns 0. > > In between them, A\* has non-zero `H()` (biased to underestimate; if > it overestimates, it starts to move towards Greedy Best First) and > non-zero `G()`. > > We can instead have a `getroutealgo` command that accepts two > functions, `G()` and `H()` using the above technique ("pass in the > method name and any extra method args, i.e. closure"), which would > also allow plugins to experiment with using A* and Greedy Best First > by simply playing around with the cost functions. > > Then we could also have a `payalgo` command that accepts a > `getroute`-like replacement function, using the closure technique, so > that plugins can experiment with various routing functions without > having to reimplement the complicated `pay` logic. > > -- > > The difficulty here is that `getroute` is really implemented over in > `gossipd`, and the interfaces to the plugins are over in the main > daemon `lightningd`. > So we may very well need to expose the on-disk routemap format to > `lightningd` at least so that `getroute` is instead implemented there, > where it can poke at the plugin RPC interface. > > Thoughts, Rusty? > > Regards, > ZmnSCPxj