Selecting the correct dropdowns and exporting a CSV with rvest

I am trying to write a script that does the following:

  1. Logs into a site with a name and password
  2. Selects a value from a first dropdown
  3. Selects a value from a second dropdown
  4. Selects a value from a third dropdown
  5. Clicks a button that generates a CSV file based on those selections

Thus far, I have gotten through Step #2, but I'm struggling with the next three.

For the record, I cannot provide a completely reproducible example given the proprietary nature of what I'm trying to do, but I will be as detailed as possible.

# Address of the login webpage
# create a web session with the desired login address
pgsession <- html_session(login)
pgform <- html_form(pgsession)[[1]]  # in this case the submit is the 2nd form
filled_form <- set_values(pgform,
                          "ctl00$MainContent$Login1$UserName" = "abc",
                          "ctl00$MainContent$Login1$Password" = "xyz")
pg <- submit_form(pgsession, filled_form)
results <- html_nodes(pg, "select[name='CustomerNumberList'] > option") %>%
  html_attr("value") %>%
  html_nodes("select[name='StatusCode'] > option")

The final line is where I see the first error:

Error in UseMethod("xml_find_all") : no applicable method for 'xml_find_all' applied to an object of class "character"
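
The error arises because `html_attr()` returns a plain character vector, so the second `html_nodes()` in the pipe has no HTML nodes left to search. A minimal sketch of one way around it is to query each dropdown separately from the parsed page. The inline HTML here is a trimmed stand-in for the real page (an assumption, since the site itself is not available), and the variable names are illustrative:

```r
library(rvest)

# Stand-in for the logged-in page: a trimmed copy of two of the dropdowns
page <- read_html('
<select name="StatusCode" id="StatusCode">
  <option selected="selected" value="OPEN">ALL OPEN ORDERS</option>
  <option value="BOOKED">BOOKED</option>
  <option value="INVOICED">ALL INVOICED ORDERS</option>
</select>
<select name="OutputFormat" id="OutputFormat">
  <option selected="selected" value="HTML">HTML (Screen)</option>
  <option value="Excel">Excel Spreadsheet</option>
</select>')

# Each pipe starts again from the parsed page, not from a character vector
status_values <- page %>%
  html_nodes("select[name='StatusCode'] > option") %>%
  html_attr("value")

format_values <- page %>%
  html_nodes("select[name='OutputFormat'] > option") %>%
  html_attr("value")

print(status_values)
print(format_values)
```

Against the real site, `page` would be replaced by the object returned from the login step.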

The StatusCode element's outerHTML looks like this:
<select name="StatusCode" id="StatusCode" onchange="ShowHideInvoiceNumber(this)" style="width:200px;" class="">
<option selected="selected" value="OPEN">ALL OPEN ORDERS</option>
<option value="BOOKED"> &nbsp; BOOKED</option>
<option value="RESERVED"> &nbsp; RESERVED</option>
<option value="CUT"> &nbsp; CUT</option>
<option value="DIRECT"> &nbsp; DIRECT ORDERS</option>
<option value="ENTERED"> &nbsp; &nbsp; ENTERED (DIRECT ORDERS)</option>
<option value="CONFIRMED"> &nbsp; &nbsp; CONFIRMED (DIRECT ORDERS)</option>
<option value="BOOKING REQUESTED"> &nbsp; &nbsp; BOOKING REQUESTED (DIRECT ORDERS)</option>
<option value="BOOKING CONFIRMED"> &nbsp; &nbsp; BOOKING CONFIRMED (DIRECT ORDERS)</option>
<option value="SHIPPED"> &nbsp; &nbsp; SHIPPED (DIRECT ORDERS)</option>
<option value="ALL PART ORDERS"> &nbsp; ALL PART ORDERS</option>
<option value="POP ORDERS"> &nbsp; &nbsp; POP ORDERS</option>
<option value="REPLACEMENT PART ORDERS"> &nbsp; &nbsp; REPLACEMENT PART ORDERS</option>
<option value="PHOTOGRAPHY ORDERS"> &nbsp; PHOTOGRAPHY ORDERS</option>
<option value="PHOTOGRAPHY ORDERS-IN PHOTOGRAPHY"> &nbsp; &nbsp; IN PHOTOGRAPHY</option>
<option value="PHOTOGRAPHY ORDERS-BOOKED"> &nbsp; &nbsp; BOOKED</option>
<option value="INVOICED">ALL INVOICED ORDERS</option>
</select>


I'd like to select the value in that final option: <option value="INVOICED">ALL INVOICED ORDERS</option>

The HTML for the third and final dropdown is:

<select name="OutputFormat" id="OutputFormat" style="width:150px;" class="">
<option selected="selected" value="HTML">HTML (Screen)</option>
<option value="Excel">Excel Spreadsheet</option>
</select>


Say I want to select the Excel option.
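
Since a `<select>` is submitted as an ordinary form field, one option is to set it by name with `set_values()`, the same way as the login fields. A hedged sketch follows. It assumes both dropdowns live inside a single `<form>` (the real page's form layout is not shown here), and the inline page, the `/report` action, and the names `report_form` and `filled_report` are hypothetical. In rvest 1.0 and later, `set_values()` still works but is deprecated in favour of `html_form_set()`.

```r
library(rvest)

# Hypothetical stand-in page: assumes both dropdowns sit inside one <form>
page <- read_html('
<form action="/report" method="post">
  <select name="StatusCode">
    <option selected="selected" value="OPEN">ALL OPEN ORDERS</option>
    <option value="INVOICED">ALL INVOICED ORDERS</option>
  </select>
  <select name="OutputFormat">
    <option selected="selected" value="HTML">HTML (Screen)</option>
    <option value="Excel">Excel Spreadsheet</option>
  </select>
</form>')

report_form <- html_form(page)[[1]]

# Pass the option's value attribute, not its display text
filled_report <- set_values(report_form,
                            StatusCode = "INVOICED",
                            OutputFormat = "Excel")

# Inspect what would be sent for each dropdown
print(filled_report$fields$StatusCode$value)
print(filled_report$fields$OutputFormat$value)
```

If `html_form()` on the real page returns several forms, the index `[[1]]` would need to point at whichever form contains the dropdowns.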

And finally, I need to click the button as described below, which then triggers the download:

My last question is: How does the download work in R? Does it just generate the file into my working directory?
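
On the download question: rvest does not write anything to the working directory by itself. The server's reply is kept in memory in the session's `response` element (an httr response object), and its body has to be written to disk explicitly. A minimal sketch, where `save_body()`, `filled_report`, and the file name are hypothetical, and whether the server actually returns CSV or Excel content is an assumption to check against the response's `Content-Type` header:

```r
# Hypothetical helper: write a raw response body to a file and return the path
save_body <- function(body, path) {
  stopifnot(is.raw(body))
  writeBin(body, path)
  invisible(path)
}

# Assumed usage after submitting the report form (needs the live session):
# result <- submit_form(pgsession, filled_report)
# save_body(httr::content(result$response, as = "raw"), "orders.csv")
```

When a form has more than one button, `submit_form()` also accepts a `submit` argument naming the one to use.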
