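Receiving data with Flume
Scenario: the source receives data sent over TCP, the channel buffers the events (any field filtering is done by interceptors attached to the source), and the sink stores them in the cluster (HDFS).

Case ① [Note: syslog needs a suitable environment; telnet can also be used to send the data]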

source[syslogtcp],sink[hdfs]
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = syslogtcp
a1.sources.r1.port = 12345
a1.sources.r1.host = hadoop01
a1.sources.r1.channels = c1
# Describe the sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
### HDFS directory path
a1.sinks.k1.hdfs.path = hdfs://hadoop01:9000/flume
a1.sinks.k1.hdfs.filePrefix = Syslog
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 1
a1.sinks.k1.hdfs.roundUnit = minute
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
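A note on the hdfs.round, hdfs.roundValue and hdfs.roundUnit settings: they round the event timestamp down before time-based escape sequences (such as %H%M) in hdfs.path are expanded. The path above contains no such escapes, so here they have no visible effect. The rounding itself can be sketched as below. This is only an illustration of the arithmetic, not Flume's own code, and the function name round_down_minutes is hypothetical.

```shell
# Hypothetical helper: round an epoch timestamp (in seconds) down to a
# multiple of N minutes, similar in spirit to hdfs.round = true with
# hdfs.roundUnit = minute and hdfs.roundValue = N.
round_down_minutes() {
  ts=$1        # epoch seconds
  minutes=$2   # plays the role of hdfs.roundValue
  step=$(( minutes * 60 ))
  echo $(( ts - ts % step ))
}

# Example: round_down_minutes 1489310474 1    prints 1489310460
# Example: round_down_minutes 1489310474 10   prints 1489310400
```

With a path such as /flume/%H%M (an assumed example, not part of this setup), events from the same rounded bucket would land in the same directory.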
[hadoop@hadoop01 flume]$ start-all.sh
[hadoop@hadoop01 flume]$ hadoop fs -mkdir flume
[hadoop@hadoop01 flume]$ hadoop fs -ls
drwxr-xr-x - hadoop supergroup 0 2017-03-12 17:14 flume
Receiver: bin/flume-ng agent --conf conf --conf-file conf/syslog.conf --name a1 -Dflume.root.logger=INFO,console
Sender: telnet hadoop01 12345 ...
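Lines typed by hand into telnet are not in syslog format. As a rough sketch, a message in the classic BSD syslog layout (a "<PRI>" prefix where PRI = facility * 8 + severity, then timestamp, host and text) could be built as follows. The function names are hypothetical, and sending through nc assumes a netcat client is installed on the sending machine.

```shell
# Hypothetical helpers for building one syslog-style line.
# PRI = facility * 8 + severity (e.g. user-level = 1, notice = 5 gives 13).
syslog_pri() {
  echo $(( $1 * 8 + $2 ))
}

# Arguments: facility severity timestamp host message
syslog_line() {
  printf '<%d>%s %s %s\n' "$(syslog_pri "$1" "$2")" "$3" "$4" "$5"
}

# Possible usage against the source configured above (assumes nc exists):
# syslog_line 1 5 "$(date '+%b %e %H:%M:%S')" hadoop01 "test: hello" | nc hadoop01 12345
```

Each newline-terminated line becomes one Flume event, so the trailing newline written by printf matters.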
Result:
[hadoop@hadoop01 flume]$ hadoop fs -ls /flume    # note: the sink writes to the absolute path /flume, so the directory must be given with the leading "/"
Found 13 items
-rw-r--r-- 3 hadoop supergroup 177 2017-03-12 18:09 /flume/My_netcat_log.1489313346930
-rw-r--r-- 3 hadoop supergroup 224 2017-03-12 18:16 /flume/My_netcat_log.1489313794747
-rw-r--r-- 3 hadoop supergroup 185 2017-03-12 17:21 /flume/Syslog.1489310474526
-rw-r--r-- 3 hadoop supergroup 149 2017-03-12 17:21 /flume/Syslog.1489310474527
[hadoop@hadoop01 flume]$ hadoop fs -ls flume    # without the leading "/" nothing is listed: the relative path resolves to /user/hadoop/flume, not /flume
[hadoop@hadoop01 flume]$
[hadoop@hadoop01 flume]$
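The difference between the two listings comes from how HDFS resolves paths: a path without a leading "/" is taken relative to the user's HDFS home directory, which is /user/<username> by default. A minimal sketch of that rule, with a hypothetical function name and assuming the default home directory prefix:

```shell
# Hypothetical helper mimicking how "hadoop fs" resolves a path argument:
# absolute paths are used as-is, relative ones are placed under /user/<user>.
hdfs_abs_path() {
  path=$1
  user=$2
  case "$path" in
    /*) echo "$path" ;;
    *)  echo "/user/$user/$path" ;;
  esac
}

# Example: hdfs_abs_path flume hadoop     prints /user/hadoop/flume
# Example: hdfs_abs_path /flume hadoop    prints /flume
```

So "hadoop fs -mkdir flume" created /user/hadoop/flume, while the sink writes to /flume as given in hdfs.path.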

Case ② [send data with telnet]

source[netcat],sink[hdfs]
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.port = 12321
a1.sources.r1.bind = hadoop01
a1.sources.r1.channels = c1
# Describe the sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
### HDFS directory path
a1.sinks.k1.hdfs.path = hdfs://hadoop01:9000/flume
a1.sinks.k1.hdfs.filePrefix = My_netcat_log
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 1
a1.sinks.k1.hdfs.roundUnit = minute
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
Receiver: bin/flume-ng agent --conf conf --conf-file conf/netcat.conf --name a1 -Dflume.root.logger=INFO,console
Sender: telnet hadoop01 12321 ...
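The netcat source turns every newline-terminated line it receives into one event, so test data can also be generated by a script instead of being typed into telnet. A small sketch, where make_lines is a hypothetical helper and the nc client is assumed to be available:

```shell
# Hypothetical helper: print COUNT numbered test lines, one event per line.
make_lines() {
  count=$1
  prefix=$2
  i=1
  while [ "$i" -le "$count" ]; do
    printf '%s %d\n' "$prefix" "$i"
    i=$(( i + 1 ))
  done
}

# Possible usage against the netcat source on port 12321 (assumes nc exists):
# make_lines 3 test-event | nc hadoop01 12321
```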
Result:
[hadoop@hadoop01 flume]$ hadoop fs -ls /flume
Found 13 items
-rw-r--r-- 3 hadoop supergroup 177 2017-03-12 18:09 /flume/My_netcat_log.1489313346930
-rw-r--r-- 3 hadoop supergroup 224 2017-03-12 18:16 /flume/My_netcat_log.1489313794747
[hadoop@hadoop01 flume]$ hadoop fs -ls flume
[hadoop@hadoop01 flume]$
[hadoop@hadoop01 flume]$

Case ③ [send data with curl]

source[http],sink[hdfs]
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = http
a1.sources.r1.port = 50000
a1.sources.r1.bind = hadoop01
a1.sources.r1.channels = c1
# Describe the sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
###HDFS
a1.sinks.k1.hdfs.path = hdfs://hadoop01:9000/flume
a1.sinks.k1.hdfs.filePrefix = Http_log
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 1
a1.sinks.k1.hdfs.roundUnit = minute
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
Receiver: bin/flume-ng agent -c conf -f conf/http.conf -n a1 -Dflume.root.logger=INFO,console
Sender:
[hadoop@hadoop01 flume]$ curl -X POST -d '[{"headers":{"h1":"v1","h2":"v2"},"body":"hello body"}]' http://hadoop01:50000
[hadoop@hadoop01 flume]$ curl -X POST -d '[{"headers":{"h1":"v1","h2":"v2"},"body":"asdascfascas"}]' http://hadoop01:50000
[hadoop@hadoop01 flume]$ curl -X POST -d '[{"headers":{"h1":"v1","h2":"v2"},"body":"xxxxxxxxxxx"}]' http://hadoop01:50000
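The payload in these requests is a JSON array of events, each with a "headers" map and a "body" string, which is what the HTTP source's default JSON handler expects. To avoid retyping it, the payload could be produced by a small function. This is a sketch only: flume_json_event is a hypothetical name, the headers are copied from the commands above, and no JSON escaping is done, so the body must not contain double quotes or backslashes.

```shell
# Hypothetical helper: wrap a plain body string in the JSON event array
# used in the curl commands above. No escaping is performed.
flume_json_event() {
  body=$1
  printf '[{"headers":{"h1":"v1","h2":"v2"},"body":"%s"}]' "$body"
}

# Possible usage against the HTTP source configured above:
# curl -X POST -d "$(flume_json_event 'hello body')" http://hadoop01:50000
```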
Result:
[hadoop@hadoop01 flume]$ hadoop fs -ls /flume
Found 16 items
-rw-r--r-- 3 hadoop supergroup 145 2017-03-12 18:49 /flume/Http_log.1489315734229
-rw-r--r-- 3 hadoop supergroup 147 2017-03-12 18:49 /flume/Http_log.1489315785602
-rw-r--r-- 3 hadoop supergroup 161 2017-03-12 18:49 /flume/Http_log.1489315785603