Thanks. As often happens, there is more than one problem. Here, lines 6-8, which we want to fix, have more potential semicolon-delimited (;) variables than we'd like.
I'll use saude.txt as the name of the file from which the data come. From a terminal session (not the console):
awk -v FS=";" '{print NR,NF}' saude.txt
1 30
2 30
3 30
4 30
5 30
6 19
7 1
8 12
9 30
This shows the line number and the field count for each line, and the correct number of fields appears to be 30. As a first step, we want to separate the lines with 30 fields from the incorrect lines:
awk -v FS=";" '{if (NF == 30) print}' saude.txt
430005;AGUA SANTA;14;PASSO FUNDO - R17 R18 R19;Feminino;30 a 39;RT-PCR;27/11/2020;26/11/2020;02/12/2020;;RECUPERADO;NAO;SIM;NAO;NAO;NAO;NAO;;NAO;;10/12/2020;BRANCA;NAO ENCONTRADO;NAO;INTERIOR;NAO;E-SUS;BRASIL;NAO
430005;AGUA SANTA;14;PASSO FUNDO - R17 R18 R19;Feminino;20 a 29;RT-PCR;04/12/2020;25/11/2020;08/12/2020;;RECUPERADO;NAO;NAO;NAO;SIM;SIM;NAO;;NAO;;09/12/2020;BRANCA;NAO ENCONTRADO;NAO;VILA RECH;NAO;E-SUS;BRASIL;NAO
430005;AGUA SANTA;14;PASSO FUNDO - R17 R18 R19;Feminino;15 a 19;TESTE RAPIDO;27/11/2020;20/11/2020;02/12/2020;;RECUPERADO;NAO;NAO;NAO;SIM;NAO;SIM;;NAO;;04/12/2020;BRANCA;NAO ENCONTRADO;NAO;INTERIOR;NAO;E-SUS;BRASIL;NAO
430005;AGUA SANTA;14;PASSO FUNDO - R17 R18 R19;Feminino;20 a 29;TESTE RAPIDO;07/12/2020;03/12/2020;09/12/2020;;RECUPERADO;NAO;NAO;NAO;SIM;NAO;NAO;;NAO;;17/12/2020;BRANCA;NAO ENCONTRADO;NAO;CENTRO;NAO;E-SUS;BRASIL;NAO
430005;AGUA SANTA;14;PASSO FUNDO - R17 R18 R19;Masculino;50 a 59;RT-PCR;01/12/2020;26/11/2020;08/12/2020;;RECUPERADO;NAO;SIM;SIM;NAO;SIM;NAO;;NAO;;10/12/2020;BRANCA;NAO ENCONTRADO;NAO;INTERIOR;NAO;E-SUS;BRASIL;NAO
430005;AGUA SANTA;14;PASSO FUNDO - R17 R18 R19;Masculino;40 a 49;RT-PCR;04/12/2020;01/11/2020;16/12/2020;;RECUPERADO;NAO;SIM;NAO;SIM;NAO;NAO;;NAO;;15/11/2020;BRANCA;NAO ENCONTRADO;NAO;CENTRO;NAO;E-SUS;BRASIL;NAO
The same command, with its output redirected, saves those lines to good.txt:
awk -v FS=";" '{if (NF == 30) print}' saude.txt > good.txt
The remaining lines are set aside in bad.txt:
awk -v FS=";" '{if (NF != 30) print}' saude.txt > bad.txt
which then holds the three fragments:
430005;AGUA SANTA;14;PASSO FUNDO - R17 R18 R19;Masculino;60 a 69;TESTE RAPIDO;07/12/2020;23/11/2020;08/12/2020;19/12/2020;RECUPERADO;SIM;NAO;SIM;NAO;SIM;SIM;Doena Cardiovascular Clinica
Diabetes mellitus
Obesidade;NAO;;;BRANCA;NAO ENCONTRADO;NAO INFORMADO;INTERIOR;SIM;SIVEP HOSP;BRASIL;NAO
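The good/bad split above reads saude.txt twice. As a rough sketch, the same split can be done in a single pass by letting awk choose the output file per line. The demo file names and the 3-field toy data below are hypothetical stand-ins; for the real file you would presumably set want=30 and use saude.txt, good.txt, and bad.txt.

```shell
# Toy stand-in for saude.txt (hypothetical data: 3 fields per good line, not 30)
printf '%s\n' 'a;b;c' 'd;e' 'f' 'g;h;i' > demo.txt

# One pass: lines with the expected field count go to one file, the rest to another
awk -F';' -v want=3 '{ print > (NF == want ? "demo_good.txt" : "demo_bad.txt") }' demo.txt

# Sanity check: the two parts should add up to the original line count
wc -l demo.txt demo_good.txt demo_bad.txt
```

The parentheses around the ternary matter: without them, some awk implementations parse the redirection target differently.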
Now a decision must be made on which variable in the last line is surplus. Then we can work on fixing it.
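One check that may help with that decision is to glue the three fragments back into a single line and count the fields again. This is only a sketch: it assumes bad.txt holds exactly one broken record (as in the sample shown here), and the space used as glue is an arbitrary choice.

```shell
# Recreate bad.txt from the three fragments shown in this post (skip if you already have it)
cat > bad.txt <<'EOF'
430005;AGUA SANTA;14;PASSO FUNDO - R17 R18 R19;Masculino;60 a 69;TESTE RAPIDO;07/12/2020;23/11/2020;08/12/2020;19/12/2020;RECUPERADO;SIM;NAO;SIM;NAO;SIM;SIM;Doena Cardiovascular Clinica
Diabetes mellitus
Obesidade;NAO;;;BRANCA;NAO ENCONTRADO;NAO INFORMADO;INTERIOR;SIM;SIVEP HOSP;BRASIL;NAO
EOF

# Join all lines into one, using a space as glue, then count semicolon-delimited fields
paste -s -d ' ' bad.txt | awk -F';' '{print NF}'
```

For these three fragments the count comes out as 30, which suggests they may be one record whose 19th field contains line breaks, rather than a record with a surplus variable. If that holds for the full file, rejoining the fragments could be enough, but that is worth confirming against the column headers.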