Data From S3

AWS Configuration

  1. Open the IAM service in the AWS console.
  2. Create a user and copy its Access Key ID and Secret Access Key.
  3. Create a group, attach the AmazonS3FullAccess policy, and add the user to the group.
  4. Create an S3 bucket.

Spark application in Scala

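    import org.apache.spark.{SparkConf, SparkContext}

    // Run locally with two worker threads.
    val sc = new SparkContext(new SparkConf().setAppName("Recommender").setMaster("local[2]"))
    // Credentials for the s3n filesystem; replace *** with the user's access key pair.
    val hadoopConf = sc.hadoopConfiguration
    hadoopConf.set("fs.s3n.awsAccessKeyId", "***")
    hadoopConf.set("fs.s3n.awsSecretAccessKey", "***")
    // Read the file from the bucket as an RDD of lines.
    val base = "s3n://lin-spark-sample/ch3/"
    val rawUserArtistData = sc.textFile(base + "user_artist_data.txt")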
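
The keys in the snippet above are masked, and in practice they should probably not be written into source code at all. The following is a minimal sketch of one alternative, assuming the key pair is exported in the environment variables `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`. The `S3Credentials` object and its `s3nSettings` method are hypothetical helpers written for this illustration and are not part of Spark or Hadoop.

```scala
object S3Credentials {
  // Hypothetical helper: builds the fs.s3n.* settings from an environment map.
  // Assumes the variable names AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.
  def s3nSettings(env: Map[String, String]): Map[String, String] = {
    // Fail fast with a readable message if a variable is missing.
    def required(name: String): String =
      env.getOrElse(name, sys.error(s"Environment variable $name is not set"))

    Map(
      "fs.s3n.awsAccessKeyId"     -> required("AWS_ACCESS_KEY_ID"),
      "fs.s3n.awsSecretAccessKey" -> required("AWS_SECRET_ACCESS_KEY")
    )
  }
}
```

With this in place, the two `hadoopConf.set` calls could be replaced by `S3Credentials.s3nSettings(sys.env).foreach { case (k, v) => hadoopConf.set(k, v) }`. Taking the environment as a parameter keeps the helper easy to check without real credentials.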

Using the s3n scheme