Datastage
1. Execute Command 1
cat PathName/FileName | nawk -F"|" '{print $1}' | sort -u | wc -l
The above command reads the file and returns the distinct count of the first column. In your case it is 4. This count is the end value for the StartLoop activity and determines how many times the loop runs. On each iteration the loop passes a new FileName as a parameter, which lets us create a new file.
2. UserVariable Activity 1
Here we format the output of the Execute Command activity:
trim(Convert(@FM,"", Execute_Command1Output))
3. StartLoop Activity
From=1 Step=1 To=#UserVariable Activity1#
4. Execute Command 2
Command: cat PathName/FileName | nawk -F"|" '{print $1}' | sort -u | sed -n
Parameters: #StartLoop_Activity_58.$Counter#p
where #StartLoop_Activity_58.$Counter# is the loop counter.
The above command returns the next #FileName# on each iteration of the loop. Eg:
cat dsxchange | nawk -F"|" '{print $1}' | sort -u | sed -n 1p (#StartLoop_Activity_58.$Counter# = 1) gives File1
cat dsxchange | nawk -F"|" '{print $1}' | sort -u | sed -n 2p (#StartLoop_Activity_58.$Counter# = 2) gives File2
etc.
5. UserVariable Activity 2
Here we format the output of the Execute Command activity:
trim(Convert(@FM,"", Execute_Command2Output))
6. Job Activity
Calls the job and passes the file name as a parameter. The FileName parameter of the job = UserVariable Activity2 (in the Expression grid).
7. EndLoop Activity
Completes the loop and maps it back to the StartLoop activity.

In DataStage parallel job
A parallel job is designed for creating multiple files.
seqfile------->filterstage------->seqfile
1. In the source sequential file stage we read the source file.
2. In the filter stage we pass the below condition:
Col1 like '#FileName#'
where Col1 is the first column carrying the file name and #FileName# is the parameter passed from the job sequence. We map only the second column to the target, which should be the contents of the file.
3. We create the target file as #FileName#.txt, where #FileName# comes from the job sequence as a parameter.
Now the loop runs in the sequence and calls the job with the #FileName# parameter. Each time the job runs, the filter stage selects the records for the respective #FileName# from the source file, so a separate file is created in the target for each group of values.
================================================================================
In a parallel job use an External Target stage and use the below code in the program box.
Code:
cd /your_directory;awk '{field1=index($0,",");print substr($0,field1+1)>substr($0,1,field1-1)}'
Just tried it out, it should give you the expected result.
================================================================================
Use a combination of a sequential file stage and an external target stage.
For the input use a sequential file stage using a file pattern. Include a column for the file name itself.
Extract your required values and perform whatever transforms you need to perform.
Prior to output concatenate all the fields into a single column with a chosen delimiter. Ensure that a fully qualified file name is the first value.
Pass to an external target stage. I think you would also want to look at sorting the incoming data here by the 1st column, but I am not sure it's entirely necessary.
In the destination program of the external target stage, set your Target method to Specific program and use the following code.
Code:
awk '{nPosField1=index($0,",");print substr($0,nPosField1+1)>substr($0,1,nPosField1-1)}'
This should create a file with the required values and then append the values from the job to the named file. This process can create and write many output files at a time.
================================================================================
How to use NOT IN condition ??
We can use <> X in the where condition in the filter stage. (You can also use AND in the where condition, for example Col1 <> 'X' AND Col1 <> 'Y' to exclude two values.)
================================================================================
StringToTime('Columnname',"hh%nn%ss.6")
================================================================================
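To try the two file-splitting approaches from the notes above outside DataStage, here is a minimal shell sketch. It is only an emulation under assumptions: the sample data, the ./split_demo scratch directory, and the variable names are hypothetical; plain awk stands in for nawk; an equality test stands in for the filter stage's "like" condition; and the ".txt" suffix is added while building the "filename,content" records for the single-pass awk.

```shell
#!/bin/sh
# Sketch only: emulates both splitting approaches outside DataStage.
set -eu

demo=./split_demo            # hypothetical scratch directory
rm -rf "$demo"
mkdir -p "$demo/loop_out" "$demo/awk_out"

# Hypothetical sample: column 1 = target file name, column 2 = file content.
cat > "$demo/source.txt" <<'EOF'
File1|alpha
File2|beta
File1|gamma
File3|delta
File4|epsilon
File2|zeta
EOF

# Approach 1: what the job sequence does.
# Execute Command 1: distinct count of column 1 = end value of the loop.
count=$(awk -F'|' '{print $1}' "$demo/source.txt" | sort -u | wc -l | tr -d ' ')

i=1
while [ "$i" -le "$count" ]; do
    # Execute Command 2: pick the i-th distinct name with sed -n "<i>p".
    name=$(awk -F'|' '{print $1}' "$demo/source.txt" | sort -u | sed -n "${i}p")
    # Stand-in for the parallel job: keep rows for this name, write column 2.
    awk -F'|' -v want="$name" '$1 == want {print $2}' "$demo/source.txt" \
        > "$demo/loop_out/$name.txt"
    i=$((i + 1))
done

# Approach 2: single pass, as in the External Target notes.
# Build "filename,content" records, then let awk route each record to its file.
awk -F'|' '{print $1 ".txt," $2}' "$demo/source.txt" |
    (cd "$demo/awk_out" && awk '{
        pos = index($0, ",")           # first comma ends the file name
        out = substr($0, 1, pos - 1)   # target file name
        print substr($0, pos + 1) > out
    }')

echo "distinct names: $count"
diff -r "$demo/loop_out" "$demo/awk_out" && echo "both approaches agree"
```

The loop version rereads the source once per distinct name, as the sequence does, while the awk version writes every output file in one pass. Which one fits better depends on the number of distinct names and on whether the per-file job needs other DataStage stages.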