Data From S3

AWS Configuration

Open the IAM service in the AWS console.
Create a user and copy its Access Key ID and Secret Access Key.
Create a group, attach the AmazonS3FullAccess policy, and add the user to it.
Create an S3 bucket.

Spark application in Scala

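    import org.apache.spark.{SparkConf, SparkContext}
    // Run locally with two worker threads.
    val sc = new SparkContext(new SparkConf().setAppName("Recommender").setMaster("local[2]"))
    // Pass the IAM user's credentials to the s3n filesystem connector.
    val hadoopConf = sc.hadoopConfiguration
    hadoopConf.set("fs.s3n.awsAccessKeyId", "***")
    hadoopConf.set("fs.s3n.awsSecretAccessKey", "***")
    // Read a text file from the bucket as an RDD of lines.
    val base = "s3n://lin-spark-sample/ch3/"
    val rawUserArtistData = sc.textFile(base + "user_artist_data.txt")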
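
The snippet above writes the keys directly into the source. One possible alternative, shown here only as a sketch, is to read them from the standard AWS environment variables `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`. The `S3Credentials` object and its `s3nProperties` method are hypothetical names introduced for illustration; they are not part of Spark or Hadoop.

```scala
// Hypothetical helper (not part of Spark or Hadoop): maps AWS credentials
// found in an environment to the s3n property names used above.
object S3Credentials {
  // Returns the two s3n properties, or None if either variable is missing.
  def s3nProperties(env: Map[String, String]): Option[Map[String, String]] =
    for {
      id     <- env.get("AWS_ACCESS_KEY_ID")
      secret <- env.get("AWS_SECRET_ACCESS_KEY")
    } yield Map(
      "fs.s3n.awsAccessKeyId"     -> id,
      "fs.s3n.awsSecretAccessKey" -> secret
    )
}
```

Assuming both variables are exported in the shell that launches the application, the result could be applied with `S3Credentials.s3nProperties(sys.env).foreach(_.foreach { case (k, v) => hadoopConf.set(k, v) })`. Taking the environment as a parameter keeps the helper testable without real credentials.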

Using the s3n scheme