Create & Populate StackOverflow.dbo.PostTags table

Spread the love

As a DBA & instructor, very often I like to generate heavy workload on my SQL Server instances to re-produce scenarios like 100% CPU, zero PLE, high IO, etc. For these activities, I find StackOverflow dump very handy.

In case you are not aware of StackOverflow dump, and want to start fresh, below links might help:-

— 01) Common Queries
https://data.stackexchange.com/stackoverflow/queries

— 02) Brent Introduction to StackOverflow Schema
https://www.brentozar.com/archive/2018/02/gentle-introduction-stack-overflow-schema/

— 03) Database schema documentation for the public data dump and SEDE
https://meta.stackexchange.com/questions/2677/database-schema-documentation-for-the-public-data-dump-and-sede

— 04) Query Plan Operators
https://sqlserverfast.com/epr/operator-list/

— 05) Database for practice
https://www.brentozar.com/archive/2018/01/updated-and-smaller-stack-overflow-demo-databases/
https://www.brentozar.com/archive/2015/10/how-to-download-the-stack-overflow-database-via-bittorrent/

Very recently, BrentOzar provided Extra-Large version of StackOverflow database of 381GB size from stack dump of 2020/06: 46GB torrent (magnet).

But, StackOverflow dump does not include separate dump for table dbo.PostTags which is very commonly utilized in various queries. Rather, the same data come as part of column Tags in table dbo.Posts.

In order to create & populate a separate table StackOverflow.dbo.PostTags, I wrote the below query utilizing STRING_SPLIT function which was added in SQL 2016.

With all the required tables, you can start writing/utilizing common queries to generate/test SQL Server performance issues.

I hope this has been helpful. Happy Coding 🙂

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.