Resource Downloads
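• Data Analytics, BI, and Visualization: Learn about the latest data analytics, BI, and visualization technologies, along with solutions from customers and the community.
• Data Engineering: Dive into the latest in data engineering, from implementing data pipelines to managing data quality, ETL and data quality frameworks, and DataOps.
• Data Lakes, Data Warehouses and Data Lakehouses: Learn the concepts and best practices behind the evolution of data lakes and data warehouses into Data Lakehouses.
• Data Science, Machine Learning, and MLOps: Learn techniques and best practices for production data science and machine learning pipelines.
• Data Security and Governance:
• Research: Dedicated to academic and advanced industrial research, including large-scale schedulers, graphs, data analytics, and machine learning systems.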
Resource Catalog
2022.06.29.Deep Dive into the New Features of Apache Spark 3.2 and 3.3
A Modern Approach to Big Data for Finance_iteblog
A-Practitioner’s-Guide-to-Unity-Catalog_A-Technical-Deep-Dive_iteblog
Accelerating-the-Pace-of-Autism-Diagnosis-with-Machine-Learning-Models_iteblog
Achieve-Machine-Learning-Hyper-Productivity-with-Transformers-and-Hugging-Face_iteblog
Administrator-Best-Practices-and-Tips-for-Future-Proofing-your-Databricks-Account_iteblog
Advanced-Migrations_From-Hive-to-SparkSQL_iteblog
Agile Data Engineering_Reliability and Continuous Delivery at Scale_iteblog
AI-Fueled-Forecasting_The-Next-Generation-of-Financial-Planning_iteblog
AI-powered-Assortment-Planning-Solution_iteblog
ALaSpark-Gousto-Recipe-for-Building-Scalable-PySpark-Pipelines_iteblog
Amgen’s Journey To Building a Global 360 View of its Customers with the Lakehouse_iteblog
An Advanced S3 Connector for Spark to Hunt for Cyber Attacks_iteblog
Apache Arrow Flight SQL_High Performance, Simplicity, and Interoperability for Data Transfers_iteblog
Apache Spark on Kubernetes—Lessons Learned from Launching Millions of Spark Executors_iteblog
Apache Spark SQL Aggregate Improvement at Meta (Facebook)_iteblog
Apache-Spark-AQE-SkewedJoin-Optimization-and-Practice-in-ByteDance_iteblog
Applied Predictive Maintenance in Aviation_Without Sensor Data_iteblog
Auto Encoder Decoder-Based Anomaly Detection with the Lakehouse Paradigm_iteblog
Automate Your Delta Lake or Practical Insights on Building Distributed Data Mesh_iteblog
Automating Model Lifecycle Orchestration with Jenkins_iteblog
Automating-Business-Decisions-Using-Event-Streams_iteblog
Backfill Streaming Data Pipelines in Kappa Architecture_iteblog
Best-Practices-of-Maintaining-High-Quality-Data_iteblog
Big Data in the Age of Moneyball_iteblog
Build an Enterprise Lakehouse for Free with Trino and Delta Lake_iteblog
Building a Data Science as a Service platform in Azure with Databricks_iteblog
Building a Lakehouse for Data Science at DoorDash_iteblog
Building an Analytics Lakehouse at Grab_iteblog
Building and Scaling Machine Learning-Based Products in the World’s Largest Brewery_iteblog
Building Enterprise Scale Data and Analytics Platforms at Amgen_iteblog
Building Metadata and Lineage Driven Pipelines on Kubernetes_iteblog
Building Production-Ready Recommender Systems with Feature Stores_iteblog
Building-an-Operational-Machine-Learning-Organization-from-Zero-and-Leveraging-ML-for-Crypto-Security_iteblog
Building-Spatial-Applications-with-Apache-Spark-and-CARTO_iteblog
Case Study-in-Rearchitecting-an-On-Premises-Pipeline-in-the-Cloud_iteblog
Challenges-in-Time-Series-Forecasting_iteblog
Chaos-Engineering-in-the-World-of-Large-Scale-Complex-Data-Flow_iteblog
Cloud-and-Data-Science-Modernization-of-Veterans-Affairs-Financial-Service-Center-with-Azure-Databricks_iteblog
Cloud-Native-Geospatial-Analytics-at-JLL_iteblog
Computational-Data-Governance-at-Scale_iteblog
Connecting-the-Dots-with-DataHub_Lakehouse-and-Beyond_iteblog
Coral-and-Transport_Portable-SQL-and-UDFs-for-the-Interoperability-of-Spark-and-Other-Engines_iteblog
Correlation-Over-Causation_Cracking-the-Relationship-Between-User-Engagement-and-User-Happiness_iteblog
Customer-centric-Innovation-to-Scale-Data-AI-Everywhere_iteblog
Cutting-the-Edge-in-Fighting-Cybercrime_Reverse-Engineering-a-Search-Language-to-Cross-Compile-it-to-PySpark_iteblog
Data-Boards_A-Collaborative-and-Interactive-Space-for-Data-Science_iteblog
Data-Centric-Principles-for-AI-Engineering_iteblog
Data-Lakehouse-and-Data-Mesh_Two-Sides-of-the-Same-Coin_iteblog
Databricks-and-Enterprise-Observability-with-Overwatch_iteblog
Databricks-Meets-Power-BI_iteblog
DataFusion-and-Arrow_Supercharge-Your-Data-Analytical-Tool-with-a-Rusty-Query-Engine_iteblog
DBA-Perspective_Optimizing-Performance-Table-by-Table_iteblog
dbt-and-Python-Better-Together_iteblog
dbt-Machine-Learning_What-Makes-a-Great-Baton-Pass_iteblog
Deep-Dive-into-Delta-Lake_iteblog
Deep-Dive-into-the-New-Features-of-Apache-Spark_iteblog
Delta-Lake-Overview_iteblog
Delta-Sharing-A-New-Paradigm-for-Secure-Data-Sharing-and-Data-Collaboration-on-Lakehouse_iteblog
Democratizing-Metrics-at-Airbnb_iteblog
Designing-Better-MLOps-Systems_iteblog
Destination-Lakehouse_All-Your-Data-Analytics-and-AI-on-One-Platform_iteblog
Detecting-Financial-Crime-Using-an-Azure-Advanced-Analytics-Platform-and-MLOps-Approach_iteblog
Discover-Data-Lakehouse-With-End-to-End-Lineage_iteblog
Disrupting-the-Prescription-Drug-Market-with-AI-and-Data_iteblog
Distributed-Machine-Learning-at-Lyft_iteblog
Doubling-the-Capacity-of-the-Data-Platform-Without-Doubling-the-Cost_iteblog
Elixir_The-Wickedly-Awesome-Batch-and-Stream-Processing-Language-You-Should-Have-in-Your-Toolbox_iteblog
Embedding-Privacy-by-Design-Into-Data-Infrastructure-Through-Open-Source-Extensible-Tooling_iteblog
Enable-Production-ML-with-Databricks-Feature-Store_iteblog
Enabling-BI-in-a-Lakehouse-Environment_iteblog
Enabling-Learning-on-Confidential-Data_iteblog
Ensuring-Correct-Distributed-Writes-to-Delta-Lake-in-Rust-with-Formal-Verification_iteblog
Evolution-of-Data-Architectures-and-How-to-Build-a-Lakehouse_iteblog
Fugue-Tune_Distributed-Hybrid-Hyperparameter-Tuning_iteblog
GIS-Pipeline-Acceleration with-Apache-Sedona_iteblog
Git-for-Data-Lakes_How-lakeFS-Scales-Data-Versioning-to-Billions-of-Objects_iteblog
Hassle-Free-Data-Ingestion-into-the-Lakehouse_iteblog
How to Automate the Modernization and Migration of Your Data Warehousing Workloads to Databricks Lakehouse_iteblog
How-EPRI-Uses-Computer-Vision-to-Mitigate-Wildfire-Risks-for-Electric-Utilities_iteblog
How-Robinhood-Built-a-Streaming-Lakehouse-to-Bring-Data-Freshness-from-24h-to-Less-Than-15 Mins_iteblog
How-socat-and-UNIX-Pipes-Can-Help Data-Integration_iteblog
How-the-Largest-County-in-the-US-is-Transforming-Hiring-with-a-Modern-Data-Lakehouse_iteblog
How-to-Build-a-Complete-Security-and-Governance-Solution-Using-Unity-Catalog_iteblog
How-to-Implement-a-Semantic-Layer-for-Your-Lakehouse_iteblog
How-To-Make-Apache-Spark-on-Kubernetes-Run-Reliably-on-Spot-Instances_iteblog
How-To-Use-Databricks-SQL-for-Analytics-on-Your-Lakehouse_iteblog
Implementing-a-Framework-for-Data-Security-and-Policy-at-a-Large-Public-Sector-Agency_iteblog
Implementing-an-End-to-End-Demand-Forecasting-Solution-Through-Databricks-and-MLflow_iteblog
Implementing-Data-Governance-3.0-for-the-Lakehouse-Era_Community-Led-and-Bottom-Up_iteblog
Improving-Apache-Spark-Structured-Streaming-Application-Processing-Time_iteblog
Improving-Interactive-Querying-Experience-on-Spark-SQL_iteblog
Improving-patient-care-with-Databricks_iteblog
Ingesting-data-into-Lakehouse-with-COPY-INTO_iteblog
Integrating-Apache-Superset-into-a-B2B-Platform_Why-and-How_iteblog
Introducing-Zipline_An-Open-Source-Feature-Engineering-Platform_iteblog
Learn-to-Efficiently-Test-ETL-Pipelines_iteblog
Lessons-Learned-from-Deidentifying-700-Million-Patient-Notes_iteblog
Low-Code-Machine-Learning-on-Databricks-with-AutoML_iteblog
Mapping-Data-Quality-Concerns-to-Data-Lake-Zones_iteblog
Meshing About with Databricks_iteblog
Migrate-and-Modernize-your-Data-Platform-with-Confluent-and-Databricks_iteblog
Migrating-Complex-SAS-Processes-to-Databricks-Case-Study_iteblog
MLflow-Pipelines_Accelerating-MLOps-from-Development-to-Production_iteblog
MLOps-at-DoorDash_iteblog
Monitoring and Quality Assurance of Complex ML Deployments via Assertions_iteblog
Mosaic_A-Framework-for-Geospatial-Analytics-at-Scale_iteblog
Multimodal-Deep-Learning-Applied-to-E-commerce-Big-Data_iteblog
Near Real-Time Analytics with Event Streaming, Live Tables, and Delta Sharing_iteblog
Obfuscating-Sensitive-Information-from-Spark-UI-and-Logs_iteblog
Open-Source-Powers-the-Modern-Data-Stack_iteblog
Opening-the-Floodgates_Enabling-Fast-Unmediated-End-User-Access-to-Trillion-Row-Datasets-with-SQL-Data-Warehouses_iteblog
Optimizing-Speed-and-Scale-of-User-Facing-Analytics-Using-Apache-Kafka-and-Pinot_iteblog
Polars_Blazingly Fast DataFrames in Rust and Python_iteblog
Power-to-the-SQL-People_Python-UDFs-in-DBSQL_iteblog
Powering Up the Business with a Lakehouse_iteblog
Practical Data Governance in a Large Scale Databricks Environment_iteblog
Predicting Repeat Admissions to Substance Abuse Treatment with Machine Learning_iteblog
Presto On Spark_A Unified SQL Experience_iteblog
Privacy-Preserving-Machine-Learning-and-Big-Data-Analytics-Using-Apache-Spark_iteblog
Productionizing Ethical Credit Scoring Systems with Delta Lake, Feature Store and MLFlow_iteblog
Protecting Personally Identifiable Information (PII)_PHI Data in Data Lake via Column Level Encryption _iteblog
PySpark in Apache Spark 3.3 and Beyond_iteblog
Radical Speed on the Lakehouse_Photon Under the Hood_iteblog
Real-Time Search and Recommendation at Scale Using Embeddings and Hopsworks_iteblog
Real-Time-Cost-Reduction-Monitoring-and-Alerting_iteblog
Realize the Promise of Streaming with the Databricks Lakehouse Platform_iteblog
Recent-Parquet-Improvements-in-Apache-Spark_iteblog
Rethinking Orchestration as Reconciliation_Software-Defined Assets in Dagster_iteblog
Running a Low Cost, Versatile Data Management Ecosystem with Apache Spark at Core_iteblog
Scalable-XGBoost-on-GPU-Clusters_iteblog
Scaling AI Workloads with the Ray Ecosystem_iteblog
Scaling Your Workloads with Databricks Serverless_iteblog
Scaling-Deep-Learning-on-Databricks_iteblog
Scaling-ML-at-CashApp-with-Tecton_iteblog
Scaling-Privacy_Practical-Architectures-and-Experiences_iteblog
Security-Best-Practices-for-Lakehouse_iteblog
Self-Serve-Automated-and-Robust-CDC-pipeline-using-AWS-DMS-DynamoDB-Streams-and-Databricks-Delta_iteblog
Serverless-Kafka-and-Apache-Spark-in-a-Multi-Cloud-Data-Lakehouse-Architecture_iteblog
Serving-Near-Real-Time-Features-at-Scale_iteblog
Setting-up-On-Shelf-Availability-Alerts-at-Scale-with-Databricks-and-Azure_iteblog
Simplify-Global-DataOps-and-MLOps-Using-Oktas-FIG-Automation-Library_iteblog
Simplifying Migrations to Lakehouse—the Databricks Way_iteblog
Smart Manufacturing_Real-time Process Optimization with Databricks_iteblog
So-Fresh-and-So-Clean_Learn-How-to-Build-Real-Time-Warehouses-on-Lakehouse_iteblog
Sound-Data-Engineering-in-Rust_From-Bits to-DataFrames_iteblog
Spark-Data-Source-V2-Performance-Improvement_Aggregate-Push-Down_iteblog
Spark-Inception_Exploiting-the-Apache-Spark-REPL-to-Build-Streaming-Notebooks_iteblog
Spline_Central-Data-Lineage-Tracking-Not-Only-For-Spark_iteblog
State-of-the-Art-Natural-Language-Processing-with-Apache-Spark-NLP_iteblog
Streaming-ML-Enrichment-Framework-Using-Advanced-Delta-Table-Features_iteblog
Survey-of-Production-ML-Tech-Stacks_iteblog
Technical-and-Tactical-Football-Analysis-Through-Data_iteblog
The Databricks Notebook_Front Door of the Lakehouse_iteblog
The-Modern-Metadata-Platform_What-Why-and-How_iteblog
The-Road-to-a-Robust-Data-Lake_0_iteblog
The-Semantics-of-Biology_Vaccine-and-Drug-Research-with-Knowledge-Graphs-and-Logical-Inferencing-on-Apache-Spark_iteblog
Time-Series-Forecasting-with-PyCaret_iteblog
Tools for Assisted Apache Spark Version Migrations, From 2.1 to 3.2+_iteblog
Towards-Dynamic-Microstructure_The-Role-of-Machine-Learning-in-the-Next-Generation-of-Exchanges_iteblog
Turning-Big-Biology-Data-into-Insights-on-Disease_The-Power-of-Circulating-Biomarkers_iteblog
Turning-Fan-Data-Into-an-Asset_iteblog
UIMeta-A-10X-Faster-Cloud-Native-Spark-History-Server_iteblog
Unifying-Data-Science-and-Business_iteblog
Vision-AI_Animal-Health-Industry-Use-Cases-Using-Databricks-on-Azure_iteblog
What-to-Do-When-Your-Job-Goes-OOM-in-the-Night_Flowcharts_iteblog
X-FIPE_eXtended-Feature-Impact-for-Prediction-Explanation_iteblog
You-Have-BI-Now-What_ Activate-Your-Data_iteblog
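Copyright notice and safety reminder: This article is reposted from an online platform. It represents only the author's views and not the position of 「金融文库」. All related copyrights belong to the original authors. 「金融文库」 provides the material solely for free exchange and learning; please do not use the content or materials for commercial purposes. We thank every original author for their hard work and creativity. If this reposted content raises any copyright or infringement concerns, please contact our customer service promptly (WeChat ID: JRwenku8). Thank you!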