~dmbaturin/soupault


[INPUT NEEDED] Plugin API for using multiple selectors

Message ID: <6c6aee43-5878-4d48-0524-5e76af20ef7c@baturin.org>

Hi everyone,

Functions that fake support for multiple selectors in Soup.select
have existed in the Utils module for a while, but they weren't
available to plugins.

Recently I added HTML.select_any_of and HTML.select_all_of, which
expose those functions.
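
For illustration, here is a minimal sketch of how a plugin might use one of these functions. It assumes that HTML.select_all_of takes the page (or an element) and a table of selector strings and returns a table of elements, and that the global size() helper is available, as in the plugin code later in this thread. The helper name add_class_to_matches is hypothetical.

```lua
-- Hypothetical helper for illustration only: add a class to every element
-- that matches any selector in the given list.
function add_class_to_matches(root, selectors, class)
  -- Assumed signature: HTML.select_all_of(html, table_of_selector_strings)
  local elements = HTML.select_all_of(root, selectors)
  local i = 1
  while (i <= size(elements)) do
    HTML.add_class(elements[i], class)
    i = i + 1
  end
  -- Return the number of elements that were touched.
  return size(elements)
end
```

Inside a plugin this could be called as add_class_to_matches(page, {".foo", ".bar"}, "highlight").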

However, I wonder if faking the CSS syntax for it in
HTML.select/HTML.select_one would be a better idea.
That is, allow the user to write HTML.select(page, ".foo, .bar") and
split the selector string at commas.
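
The comma-splitting idea could be sketched in plugin Lua roughly as follows. This is only an illustration under stated assumptions, not how soupault implements it: select_multi is a hypothetical name, Regex.split is assumed to accept a pattern with \s, and size() is assumed to be available as in the plugin code later in this thread.

```lua
-- Hypothetical helper, not part of the soupault API: emulate a
-- comma-separated selector list by running HTML.select once per part.
function select_multi(root, selector_list)
  -- Assumption: the regex dialect understands \s, so whitespace around
  -- each comma is dropped together with the comma.
  local selectors = Regex.split(selector_list, "\\s*,\\s*")
  local matches = {}
  local count = 0
  local elements
  local i = 1
  local j
  while (i <= size(selectors)) do
    elements = HTML.select(root, selectors[i])
    j = 1
    while (j <= size(elements)) do
      -- Append every match of this selector to the combined result.
      count = count + 1
      matches[count] = elements[j]
      j = j + 1
    end
    i = i + 1
  end
  return matches
end
```

A naive split like this has limits: it breaks selectors that contain a comma themselves (for example inside an attribute value), an element matching two of the selectors appears twice, and results are grouped by selector rather than in document order.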

If and when lambdasoup gets real support for it, we can simply remove
that fixup and old plugins will still work.

What do you think?
Thomas Letan
Message ID: <20200305161013.4xw2mx4xtzzsgcar@ideepad.localdomain>
In-Reply-To: <6c6aee43-5878-4d48-0524-5e76af20ef7c@baturin.org>

Hi,

> If and when lambdasoup gets real support for it, we can simply remove
> that fixup and old plugins will still work.

My understanding is that we should investigate fixing the issue upstream
first, since soupault 1.9 is still relatively young and there is no rush
to release soupault 1.10 just yet.

Do you think this is something that could be done? Is lambdasoup
interested in such a contribution?

Message ID: <f78ec8fa-e3d7-3b71-d571-cd1fa81a66fb@baturin.org>
In-Reply-To: <20200305161013.4xw2mx4xtzzsgcar@ideepad.localdomain>

There's an open issue for it:
https://github.com/aantron/lambdasoup/issues/15
That change is relatively straightforward, but not trivial.


Message ID: <28dd37de-9489-cc75-b71f-d509f1bcf193@aoirthoir.com>
In-Reply-To: <20200305161013.4xw2mx4xtzzsgcar@ideepad.localdomain>

I have implemented this. In my opinion, multiple selectors should be
written the standard CSS way, comma-separated. Here is the Lua code
with some explanations.

By the time ADD1.loop_selectors is called, the selectors should already
have been split at the commas into an array. If no selectors are passed,
the variable ADD1.selectors is used. That array is built from the widget
setting selectors = "p,a,etc,etc" using Regex.split. A couple of
functions in my library automate this, so I can be sure that
ADD1.selectors holds the widget's selectors setting converted to an
array. If the widget setting does not exist and no array is passed as
selectors, what we get is a search for a long error message as if it
were a selector. That obviously matches nothing, so no harm, no foul.

These are the two functions that matter most:

1. ADD1.loop_selectors loops through the list of selectors, and for
each one it calls:
2. ADD1.loop_elements, which loops through all the elements matching
a single selector and runs whatever Lua code was passed.

You can literally call ADD1.loop_selectors('somecode') without
specifying selectors, selectorsApply or elementsApply, as these values
are taken automatically from the widget settings. Or you can give
more detail. For instance:

The code for my add-class.lua plugin is this simple:

Plugin.require_version('1.8')
dofile('plugins/add1tocobol/add1tocobol.lua')
if not config.class then Plugin.exit('add-class.lua: No class=') end
ADD1.loop_selectors('HTML.add_class(element, config.class)')

Notice that the only parameter passed is the Lua code we want to run,
and it is really simple. All the widget settings needed besides
config.class are obtained automatically by the add1tocobol.lua library.
ADD1.loop_selectors() loops through all the selectors, and then through
all the elements matching each selector.

You can also create a function and then call it. Here is the
colorize-characters.lua plugin:

Plugin.require_version('1.8')
dofile('plugins/add1tocobol/add1tocobol.lua')
function colorize_characters()
   data = ADD1.wrap_characters_with_tag(element,'span')
   ADD1.process_element('replace_content', element, data, '')
   ADD1.loop_elements(element,'span','all','colorize_code()')
end
function colorize_code()
   ADD1.apply_random_color(element,'color',config.rejected)
end
ADD1.loop_selectors('colorize_characters()')

Notice that I am specifically calling loop_elements so that I can loop
through the span elements that I just created in the current element.
Lua is well suited to these kinds of recursive calls and keeps track
of where it is.

Relevant documentation:

http://soupault.add1tocobol.com/add1tocobol-lua/

http://soupault.add1tocobol.com/add1tocobol-lua/#selectors-add1-setting

http://soupault.add1tocobol.com/add1tocobol-lua/#selectors-apply-and-elements-apply-add1-settings

Now the actual current versions of the ADD1.loop_selectors() and
ADD1.loop_elements() functions:

-- *********************************************************************
-- * ADD1.loop_selectors()
-- *********************************************************************
   function ADD1.loop_selectors(code,selectors,selectorsApply,elementsApply)
     local results  = 'none'
     if not selectors then selectors = ADD1.selectors end
     if (selectorsApply ~= 'first' and selectorsApply ~= 'all') then
       selectorsApply = ADD1.selectorsApply
     end
     local selectorsCount = size(selectors)
     local selectorsIndex = 1
     local selector
     while (selectorsIndex <= selectorsCount) do
       selector = selectors[selectorsIndex]
       results = ADD1.loop_elements(page,selector,elementsApply,code)
       if (selectorsApply == 'first' and results == 'success') then
         -- stop after the first selector that matched something
         selectorsIndex = selectorsCount + 1
       else
         selectorsIndex = selectorsIndex + 1
       end
     end
     return results
   end

-- *********************************************************************
-- * ADD1.loop_elements()
-- *********************************************************************
   function ADD1.loop_elements(parent,selector,elementsApply,code)
     local results = 'none'
     if (elementsApply ~= 'first' and elementsApply ~= 'all') then
       elementsApply = ADD1.elementsApply
     end
     local elements
     local elementsCount
     local elementsIndex
     elements = HTML.select(parent, selector)
     elementsCount = size(elements)
     elementsIndex = 1
     while (elementsIndex <= elementsCount) do
       element = elements[elementsIndex]
       dostring(code)
       results = 'success'
       if elementsApply == 'first' then
         -- stop after the first matching element
         elementsIndex = elementsCount + 1
       else
         elementsIndex = elementsIndex + 1
       end
     end
     return results
   end