Building Recommendation Engines with PySpark

Brought by: DataCamp

Overview

Learn tools and techniques to leverage your own big data to facilitate positive experiences for your users.

This course will show you how to build recommendation engines using Alternating Least Squares in PySpark. Using the popular MovieLens dataset and the Million Songs dataset, this course will take you step by step through the intuition of the Alternating Least Squares algorithm as well as the code to train, test and implement ALS models on various types of customer data.

Syllabus

Recommendations Are Everywhere
-This chapter shows how powerful recommendation engines can be and draws the key distinctions between collaborative-filtering engines and content-based engines, as well as between the implicit and explicit data that recommendation engines can use. You will also learn a powerful way to uncover hidden (latent) features that you may not even know exist in customer datasets.

How does ALS work?
-In this chapter you will review basic concepts of matrix multiplication and matrix factorization, and dive into how the Alternating Least Squares algorithm works and what arguments and hyperparameters it uses to return the best recommendations possible. You will also learn important techniques for properly preparing your data for ALS in Spark.

Recommending Movies
-In this chapter you will be introduced to the MovieLens dataset. You will walk through how to assess its suitability for ALS, build a full cross-validated ALS model on it, and learn how to evaluate its performance. This will be the foundation for all subsequent ALS models you build using PySpark.

What if you don't have customer ratings?
-In most real-life situations, you won't have "perfect" customer data available to build an ALS model. This chapter will teach you how to use your customer behavior data to "infer" customer ratings and use those inferred ratings to build an ALS recommendation engine. Using the Million Songs Dataset as well as another version of the MovieLens dataset, this chapter will show you how to use the data available to you to build a recommendation engine using ALS and evaluate its performance.
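
As a rough illustration of the "alternating" idea the syllabus refers to, here is a minimal sketch in plain Python. It is not the course's code and not Spark's implementation: it assumes a single latent feature (rank 1) so each update is a one-line regularized least-squares solve, and the names `als_rank1`, `rmse`, `observed`, and `reg`, as well as the toy ratings, are made up for this example.

```python
# Minimal rank-1 ALS sketch in plain Python (illustrative only; Spark's ALS
# uses higher ranks and a distributed solver).

def als_rank1(ratings, n_users, n_items, reg=0.01, iters=20):
    """Approximate each observed rating r by user_f[u] * item_f[i]."""
    user_f = [1.0] * n_users
    item_f = [1.0] * n_items
    for _ in range(iters):
        # Hold item factors fixed; each user factor has a closed-form
        # regularized least-squares solution over that user's ratings.
        for u in range(n_users):
            rows = [(i, r) for (uu, i, r) in ratings if uu == u]
            num = sum(r * item_f[i] for i, r in rows)
            den = sum(item_f[i] ** 2 for i, _ in rows) + reg
            user_f[u] = num / den
        # Now hold user factors fixed and solve for each item factor.
        for i in range(n_items):
            cols = [(u, r) for (u, ii, r) in ratings if ii == i]
            num = sum(r * user_f[u] for u, r in cols)
            den = sum(user_f[u] ** 2 for u, _ in cols) + reg
            item_f[i] = num / den
    return user_f, item_f


def rmse(ratings, user_f, item_f):
    """Root-mean-square error over the given (user, item, rating) triples."""
    sq = [(r - user_f[u] * item_f[i]) ** 2 for u, i, r in ratings]
    return (sum(sq) / len(sq)) ** 0.5


# Toy explicit ratings (made up); user 2 has not rated item 2.
observed = [
    (0, 0, 2.0), (0, 1, 1.0), (0, 2, 2.0),
    (1, 0, 4.0), (1, 1, 2.0), (1, 2, 4.0),
    (2, 0, 3.0), (2, 1, 1.5),
]

user_f, item_f = als_rank1(observed, n_users=3, n_items=3)
print("training RMSE:", rmse(observed, user_f, item_f))
print("predicted rating for user 2, item 2:", user_f[2] * item_f[2])
```

In PySpark the corresponding estimator is `pyspark.ml.recommendation.ALS`, where `rank`, `maxIter`, and `regParam` play roughly the roles that the single feature, `iters`, and `reg` play in this sketch.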

Taught by

Jamen Long

  • DataCamp
  • Paid
  • English
  • Certificate Available
  • Available at any time
  • All
  • N/A