Reguła automatyzacji Jira podczas scalania pull requestu
Posted by: AJ Welch
As with most relational databases, there may often be situations where you need to combine the results of multiple queries into one single dataset when using Google BigQuery. Typically in BigQuery, this occurs when you’re gathering data from multiple tables or even across datasets, and this is where the power of using a UNION
comes into play.
In this tutorial we’ll examine uniting results in BigQuery using both the default Legacy SQL syntax as well as the optional Standard SQL syntax.
Nie ograniczanie się do Agile
The default syntax of Legacy SQL in BigQuery makes uniting results rather simple. In fact, all it requires at the most basic level is listing the various tables
in a comma-delimited list within the FROM
clause.
For example, assuming all data sources contain identical columns, we can query three different tables in the gdelt-bq:hathitrustbooks
dataset and combine the result set with the following query:
SELECT
BookMeta_Identifier,
BookMeta_Date,
BookMeta_Title
FROM
[gdelt-bq:hathitrustbooks.1920],
[gdelt-bq:hathitrustbooks.1921],
[gdelt-bq:hathitrustbooks.1922]
ORDER BY
BookMeta_Identifier DESC
LIMIT
1000
Since each of the tables contain the same columns and in the same order, we don’t need to specify anything extra in either the SELECT
clause nor the filter options that follow, and yet BigQuery is intelligent enough to translate this query into a UNION ALL
to combine all the results into one dataset.
The gathered results from all three tables are ordered, as intended, by the BookMeta_Identifier
column:
[
{
"BookMeta_Identifier": "yul.11995746_000_00",
"BookMeta_Date": "1922",
"BookMeta_Title": "Foundations a statement of Christian belief in terms of modern thought: by seven Oxford men."
},
{
"BookMeta_Identifier": "yul.11729715_000_00",
"BookMeta_Date": "1922",
"BookMeta_Title": "Frankenstein or, The modern Prometheus, by Mary W. Shelley."
},
{
"BookMeta_Identifier": "yul.11368340_000_00",
"BookMeta_Date": "1920",
"BookMeta_Title": "Pilgrim tercentenary, 1620-1920 "
},
{
"BookMeta_Identifier": "yul.11165435_000_00",
"BookMeta_Date": "1922",
"BookMeta_Title": "Social and diplomatic memories, 1884-1893 by the Right Hon. Sir James Rennell Rodd ..."
},
{
"BookMeta_Identifier": "yale.39002044555168",
"BookMeta_Date": "1921",
"BookMeta_Title": "South India and her Muhammadan invaders, by S. Krishnaswami Aiyangar."
},
...
]
Nie ograniczanie się do Agile
While the comma-delimited union method seen above for Legacy SQL is convenient, using the Standard SQL option with Google BigQuery requires a more verbose (yet also more familiar) method when combining result sets.
Instead of using comma-delimation, we must revert back to the normal use of the UNION
option (followed by the ALL
or DISTINCT
keyword, as appropriate). For example, to perform the exact same uniting query on the gdelt-bq:hathitrustbooks
dataset as seen above, our full query will look like this:
SELECT
BookMeta_Identifier,
BookMeta_Date,
BookMeta_Title
FROM
`gdelt-bq.hathitrustbooks.1920`
UNION ALL
SELECT
BookMeta_Identifier,
BookMeta_Date,
BookMeta_Title
FROM
`gdelt-bq.hathitrustbooks.1921`
UNION ALL
SELECT
BookMeta_Identifier,
BookMeta_Date,
BookMeta_Title
FROM
`gdelt-bq.hathitrustbooks.1922`
ORDER BY
BookMeta_Identifier DESC
LIMIT
1000
While the spacing and indentation in the example above is irrelevant, it better illustrates what we’re doing: Taking the results of (3) unique queries, uniting them via (2) UNION ALL
options, and then sorting and limiting the result set afterward.
As expected, the final united results are identical to the comma-delimited method using Legacy SQL:
[
{
"BookMeta_Identifier": "yul.11995746_000_00",
"BookMeta_Date": "1922",
"BookMeta_Title": "Foundations a statement of Christian belief in terms of modern thought: by seven Oxford men."
},
{
"BookMeta_Identifier": "yul.11729715_000_00",
"BookMeta_Date": "1922",
"BookMeta_Title": "Frankenstein or, The modern Prometheus, by Mary W. Shelley."
},
{
"BookMeta_Identifier": "yul.11368340_000_00",
"BookMeta_Date": "1920",
"BookMeta_Title": "Pilgrim tercentenary, 1620-1920 "
},
{
"BookMeta_Identifier": "yul.11165435_000_00",
"BookMeta_Date": "1922",
"BookMeta_Title": "Social and diplomatic memories, 1884-1893 by the Right Hon. Sir James Rennell Rodd ..."
},
{
"BookMeta_Identifier": "yale.39002044555168",
"BookMeta_Date": "1921",
"BookMeta_Title": "South India and her Muhammadan invaders, by S. Krishnaswami Aiyangar."
},
...
]