Spammers get your e-mail addresses from webpages, news groups or domain records (if you have your own domain). There are individuals who use robots to extract the addresses, burn them on CDs and sell them very cheap to other Spammers. If you write your e-mail address in clear text onto your homepage today such that programs can extract it, then you will have a major problem in a few months time and you can't stop it. The problem will be growing every day!
Now lets discuss some common filter techniques and how they work. I will not describe how to configure them exactly in each MTA. Instead I suggest you to read the documentation that comes with the MTA that you have installed. Postfix and Exim are well documented
Realtime Block lists:
These are DNS based lists. You check the IP address of the mailserver that wants to send mail to your server against a blacklist of known spammers. Common lists are www.spamhaus.org. You should however not be too enthusiastic about it and carefully choose the lists since there are also some which block entire IP ranges simply because one spammer had used a dialup connection from this ISP at one point in time.
8 bit characters in subject line:
About 30% of the spam origins in China, Taiwan or other Asian countries these days. If you are sure that you can't read Chinese then you can reject mail which has a lot of 8 bit characters (not ASCII) in the subject. Some MTAs have a separate configuration option for this but you can also use regular expression matching on the header:
/^Subject:.*[^ -~][^ -~][^ -~][^ -~]/
This will reject email which has more than 4 consecutive characters in the subject line which are not in the ASCII range space to tilde. Both exim and postfix can be compiled with perl regular expression support. This method is quite good and keeps out 20-30% of the spam-mail.
Lists with "From" addresses of known spammers:
Forget it. This used to work back in 1997. Spammers today use faked addresses or addresses of innocent people.
Reject non FQDN (Fully Qualified Domain Name) sender and unknown sender domain:
Some spammers use non existent addresses in the "From". It is not possible to check the complete address but you can check the hostname/domain part of it by querying a DNS server.
This keeps out about 10-15% of the spam and you don't want these mails anyhow because you would not be able to reply to them even if they were not spam.
IP address has no PTR record in the DNS:
This checks that the IP address from where you get the mail can be reverse resolved into a domain name. This is a very powerful option and keeps out a lot of mail. I would not recommend it! This does not test if the system administrator of the mail server is good but if he has a good backbone provider. ISPs buy IP addresses from their backbone providers and they buy from bigger backbone providers. All involved backbone providers and ISPs have to configure their DNS correctly to make the whole chain work. If somebody in between makes a mistake or does not want to configure it then it does not work. It says nothing about the individual mail server at the end of the chain.
Require HELO command:
When 2 MTAs (mail servers) talk to each other (via smtp) then they first say who they are. Some spam software does not do that. This keeps out 1-5% of the spam.
Require HELO command and reject unknown servers:
You take the name that you get in the HELO command and then you go to DNS and check if this is a correctly registered server. This is very good because a spammer who uses just a temporary dialup connection will usually not configure a valid DNS record for it.
This blocks about 70-80% of all spam but rejects also legitimate mail which comes from sites with multiple mail servers where a sloppy system administrator forgot to put the hostnames of all servers into DNS.
Some MTAs have even more options but the above are quite commonly available in a good MTA. The advantage of all those checks is that they are not CPU intensive. You will usually not need to update your mailserver hardware if you use those checks.