How many widgets have we transferred to Acme this year?
Simple enough question, right?
But then when you look at the data, each region works with Acme’s local offices differently. Some transfer using one method, some offices mark the transfer in the system as “other firm”. Oh, and we don’t even get a data feed from the northwest region because they still haven’t upgraded their shit, so I can request a spreadsheet, but it’s in a different format than everything else.
Then inevitably Acme has a different number of widgets that have been transferred. Because if a transfer gets kicked back or cancelled, fixing the old one is laborious and requires tons of approvals, so they just create a new transfer and send it over.
But yea, 20 minutes should be enough time to get you that before your meeting with Acme.
Same feel as “how long is this going to take to pull?” Well, I don’t know if part of what you’re asking for exists, how clean it is, or if I can join the data you’re talking about, so anywhere from 5 minutes to never?
That’s exactly how you should respond. I’ve been the requester for some of these, and if my team gave me that as a response I’d just say “let me know what you find out or when you know more.”
Man I don’t regret leaving this behind at my last job. You start out by doing someone a one-off like “sure I can pull the top 5 promotional GICs broken down by region for your blog article - I love supporting my co-workers!”
Then requests become increasingly esoteric and arcane, and insistent.
You try to build a simple FE to expose the data for them, but you can’t get the time approved so you either have to do it with OT or good ol’ time theft, and even then there’s no replacement for just writing SQL, so you’ll always be their silver bullet.
At that point you teach them how to do it themselves. Isn’t there a way to give them an account that only has read access so they can’t inadvertently screw up the database?
I like that idea, and it actually did work for our Marketing guy (Salesforce has a kind of SQL). Near the end there, I just had to debug a few of his harder errors, or double check a script that was going to be running on production.
Never thought of it for Postgres or MySQL, etc, but I suppose there’s got to be an easy enough way to get someone access.
In Oracle you’d just set up a user that has limited access and give them those credentials. Creating a few views that pull in the data they want is a bonus.
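A rough sketch of the read-only idea, using Python's built-in sqlite3 module so it runs anywhere. SQLite has no user accounts, so opening the file in read-only mode stands in for a limited-access login here; on Oracle, Postgres or MySQL the equivalent would be a user that has only been granted SELECT. The file name, table and numbers are all made up for illustration.

```python
import os
import sqlite3
import tempfile

# Hypothetical database file and table, invented for this sketch.
db_path = os.path.join(tempfile.mkdtemp(), "widgets.db")

# The "admin" connection creates and fills the table.
admin = sqlite3.connect(db_path)
admin.execute(
    "CREATE TABLE transfers (id INTEGER PRIMARY KEY, region TEXT, widgets INTEGER)"
)
admin.execute(
    "INSERT INTO transfers (region, widgets) VALUES ('north', 12), ('south', 30)"
)
admin.commit()
admin.close()

# The "analyst" connection opens the same file read-only via a URI.
analyst = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)

# Reading works as usual.
total = analyst.execute("SELECT SUM(widgets) FROM transfers").fetchone()[0]

# Writing is rejected, so a stray DELETE can't wreck anything.
try:
    analyst.execute("DELETE FROM transfers")
    write_blocked = False
except sqlite3.OperationalError:
    write_blocked = True

analyst.close()
```

The point is only that the person running ad hoc queries never holds a connection that is able to modify data.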
I started coding with TurboBasic. My favorite thing about TB was that you could have variable names of any length but the compiler only used the first two letters - and case insensitive at that. So “Douchebag” and “doorknocker” looked like different variables but were actually the same thing.
A view is a saved query that pretends it’s a table. It doesn’t actually store any data. So if you need to query 10 different tables, joining them together and filtering the results in specific ways, a view would just be that saved query, so instead of “SELECT * FROM (a big mess of tables)” you can do “SELECT * FROM HandyView”.
Basically scripts you can run on the fly to pull calculated data. You can (mostly) treat them like tables themselves if you create them on the server.
So if you have repeat requests, you can save the view with maybe some broader parameters and then just SELECT * FROM [View_Schema].[My_View] WHERE [Year] = 2023 or whatever.
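Here's a small runnable sketch of that, again with Python's sqlite3 and an in-memory database. The tables, the `handy_view` name and the data are hypothetical; the view just wraps a join plus an aggregation so the repeat request becomes a one-line query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables and data, invented for this sketch.
conn.executescript("""
CREATE TABLE regions (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE transfers (
    id INTEGER PRIMARY KEY,
    region_id INTEGER,
    year INTEGER,
    widgets INTEGER
);
INSERT INTO regions VALUES (1, 'north'), (2, 'south');
INSERT INTO transfers (region_id, year, widgets) VALUES
    (1, 2022, 5), (1, 2023, 7), (2, 2023, 3), (2, 2023, 10);

-- The view stores the query, not the data.
CREATE VIEW handy_view AS
SELECT r.name AS region, t.year AS year, SUM(t.widgets) AS widgets
FROM transfers t
JOIN regions r ON r.id = t.region_id
GROUP BY r.name, t.year;
""")

# The repeat request is now a short query against the view.
rows = conn.execute(
    "SELECT region, widgets FROM handy_view WHERE year = 2023 ORDER BY region"
).fetchall()
```

Because the view holds no data of its own, new rows in `transfers` show up the next time it's queried.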
It can really slow things down if your views start calling other views in since they’re not actually tables. If you’ve got a view that you find you want to be calling in a lot of other views, you can try to extract as much of it as you can that isn’t updated live into a calculated table that’s updated by a stored procedure. Then set the stored procedure to run at a frequency that best captures the changes (usually daily). It can make a huge difference in runtime at the cost of storage space.
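A minimal sketch of the calculated-table approach. SQLite has no stored procedures, so a plain Python function (`refresh_region_totals`, a made-up name) plays that role here; in a real setup it would be a sproc or a job run by whatever scheduler you have. The tables and numbers are invented too.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical source table plus the calculated table that caches the totals.
conn.executescript("""
CREATE TABLE transfers (id INTEGER PRIMARY KEY, region TEXT, widgets INTEGER);
INSERT INTO transfers (region, widgets) VALUES
    ('north', 5), ('north', 7), ('south', 3);
CREATE TABLE region_totals (region TEXT PRIMARY KEY, widgets INTEGER);
""")


def refresh_region_totals(conn):
    """Rebuild the calculated table; stands in for the scheduled stored procedure."""
    with conn:  # commit on success, roll back on error
        conn.execute("DELETE FROM region_totals")
        conn.execute(
            "INSERT INTO region_totals (region, widgets) "
            "SELECT region, SUM(widgets) FROM transfers GROUP BY region"
        )


def read_totals(conn):
    return dict(conn.execute("SELECT region, widgets FROM region_totals"))


refresh_region_totals(conn)
before = read_totals(conn)

# New data arrives; the calculated table stays stale until the next refresh.
conn.execute("INSERT INTO transfers (region, widgets) VALUES ('south', 10)")
stale = read_totals(conn)

refresh_region_totals(conn)
after = read_totals(conn)
```

That staleness between refreshes is the trade-off: reads are cheap because the aggregation already happened, but they're only as fresh as the last run.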
“It can really slow things down if your views start calling other views in since they’re not actually tables”
They can be in some cases! There’s a type of view called an “indexed” or “materialized” view where the view data is stored on disk like a regular table. SQL Server’s indexed views are automatically kept up to date whenever the source tables change; materialized views in some other databases (Postgres, for example) have to be refreshed explicitly. Doesn’t work well for tables that are very frequently updated, though.
Having said that, if you’re doing a lot of data aggregation (especially if it’s a sproc that runs daily), you’d probably want to set up a separate OLAP database so that large analytical queries don’t slow down transactional queries. With open-source technologies, this is usually using Hive and Presto or Spark combined with Apache Airflow.
Also, if you have data that’s usually aggregated by column, then a column-based database like ClickHouse is usually way faster than a regular row-based database. These store data per-column rather than per-row, so aggregating one column across millions or even billions of rows (e.g. average page load time for all hits ever recorded) is fast.
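A toy illustration of the row vs column layout in plain Python. This is not how ClickHouse (or any real engine) stores data; it only shows why averaging one column touches less data when values are grouped per column. The field names and numbers are invented.

```python
from statistics import mean

# Row-oriented: each record keeps all of its fields together.
rows = [
    {"url": "/home", "load_ms": 120, "browser": "firefox"},
    {"url": "/about", "load_ms": 80, "browser": "chrome"},
    {"url": "/home", "load_ms": 100, "browser": "chrome"},
]

# Averaging one field still means walking every whole record.
avg_from_rows = mean(r["load_ms"] for r in rows)

# Column-oriented: one list per field, all values of a field stored together.
columns = {key: [r[key] for r in rows] for key in rows[0]}

# Averaging only reads the one list it needs.
avg_from_columns = mean(columns["load_ms"])
```

Both give the same answer; the difference at scale is how much unrelated data has to be read off disk to get there.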
I hate these requests so fucking much. I’ve learned a lot of SQL because of it but I’m sick of it. Especially sick of the users who ask for the same data over and over again.
One guy asked me to run a report every first of the month and then he wouldn’t respond when I sent it, so I stopped sending it. Also because he would request it AGAIN later in the month, after I had already sent it at the beginning of the month.
Guess it’s too much to search your fucking emails before requesting a new report to be run. A report that I’ve told you countless times will slow down everything for everyone else who’s using the system.
But tHis iS uRgENt aSAp to run a report asking for all data for the last 3 years.
You forgot the 3 paragraph WHERE clause to get every data point of a wednesday of an even year of a person who stubbed their toes on a roomba in their parent’s basement.
And the data they want is the entire FY, which is 3,000,000 records, and they need every single data attribute, making the file like 250 MB. Then you put it in their SharePoint and they get mad they can’t just view it in the browser despite the giant “This file is too large to view online, download it” message.
Newspaper: Hackers are announcing a trove of personal data leaked from [company] after a forwarded spreadsheet inadvertently contained more data than the sender realised.