EDIT: I am using the library package DBI for connectivity and will switch to RODBC as per some comments here: DBI/Github/Issue comments
EDIT 2: RODBC package does not save the day returning error message:
[RODBC] ERROR: Could not SQLExecDirect
EDIT 3: Works fine with SQLacademy in Python.
Using SQL Server I am able to perform the following query without hassle
Outcomes AS
(
SELECT Number, OutcomeDate, EndMatter,
CAST(STUFF((SELECT ', ' + CAST(OutcomeName AS VARCHAR(MAX)) FROM Research.EndMatterOutcome a WHERE a.Number = b.Number FOR XML PATH('')),1,1,'') AS VARCHAR(MAX)) AS OutcomeName
FROM Research.EndMatterOutcome_View b
WHERE (b.OutcomeName IS NOT NULL OR CAST(b.OutcomeName AS VARCHAR(MAX)) <> '')
GROUP BY Number, OutcomeDate, EndMatter
)
Its just joinery to collapse two rows (one column) of values into one separated by a comma. When used as part of a larger query in an SQL chunk in R studio I get the following:
Error in result_fetch(res@ptr, n, ...) :
nanodbc/nanodbc.cpp:2836: 07009: [Microsoft][ODBC SQL Server Driver]Invalid Descriptor Index
I understand from StackOverflow that there are some workarounds concerning VARCHAR(MAX), such as using these in the final SELECT statement etc but these approaches are several years old, is not easy to build in with the much larger query that this is nested within and am hoping that there is another approach because as I say the query works fine via SQL Server Management Studio.
When using DBI I just connect with the configuration I have in System DSN
#DBI command
con <- dbConnect(odbc::odbc(),
"Research_Database_mrasmussen",
trusted_connection = TRUE
)
With RODBC I just use the following command
#RODBC command
con <- odbcDriverConnect('driver={SQL Server};server=ADR-PROD-DB;database=ADR_Research;trusted_connection=true')
Regardless of the connection, I am able to produce a table without using the original query (extract) I posted above. So the connection is fine, and a dataframe of the table I want is imported fine - its just with the additional script above I am unable to reproduce a table that can be produced in SSMS and Python's SQLalchemy.
You can try changing your ODBC drivers or using an alternative connection package: https://github.com/agstudy/rsqlserver, though I have not tested that one.
I would start though with rewriting your SQL code. Your using alias b inside a subquery while defining it the main query plus having a type conversion in the WHERE clause just to check empty string seems rather complicated. Can't you just CONCAT() the two strings? They seem to come from the same table Research....
Yes they're coming from the same table but I am actually concatenating across multiple rows so CONCAT won't help. I might try using COALESCE at some point, or STRING_AGG if available.
If you concatenate across multiple rows then a join it must be.
The MS SQL drivers can be a pain; when I had a project on MS SQL Server I spent more time googling for drivers than writing code and still ended up with strange workarounds. I was glad when it was over. Like the entire Microsoft walled garden you either love it or hate it, but there is little space for middle ground.