
Object-relational mapping (ORM) frameworks have been around for several years, and some people already consider ORM outdated. As we have seen with other technologies and concepts, PHP is not exactly an early adopter among programming languages, so it took some time for ORM to mature in the PHP context.
Some frameworks before Doctrine 2 implemented ORM-specific tasks (Propel, for example), but most of them lack the maturity required for large projects. With Doctrine 2, PHP takes a huge step in the right direction: Doctrine 2 is fast, extensible and easy to use.
This article will take you on a tour through the main concepts of Doctrine 2 in the first part and then explain how to use it in a real world application in the second part. Since at the time of writing Zend Framework 1.11.xx (ZF) is very popular, we will integrate Doctrine 2 into a ZF project.
To understand Doctrine 2, we have to take a look at some relevant terms (or in this case objects), study their behavior and practice their usage. We start with some introductory remarks on ORM systems and then go on to the concepts underlying Doctrine 2: Entity Objects, the Entity Manager, Repositories and Proxies.
Since the beginning of object orientation, people have had to manage the persistence of their application's state, that is,
of their objects. In web application development, this usually involves a database server that is
queried using a query language. One example of this pattern is a PHP application that talks to an SQL server
by sending SQL queries to it. Another is an application that uses a CouchDB server by querying it via its REST API.
Due to the author's laziness, we will talk in terms of relational databases from now on. Keep in mind that you can
accomplish almost everything mentioned here with NoSQL databases, too.
ORM relates value objects that exist in an application's business logic to database records.
Every object that should be persistent is saved as one row of a database table. The most common approach is to
map classes to tables and the classes' objects to rows in these tables.
Besides writing objects to a database, ORM systems are also intended to ease the process of finding data stored in the
database. In ORM terms, finding data always means making the framework fetch one or many objects
that meet certain criteria.
The objects managed by an ORM system are called Entity Objects. Every entity object relates to one entry in a table. In Doctrine 2, the classes that represent entities do not have to fulfill special requirements such as inheriting from a certain superclass (as you might have seen in other database abstraction frameworks like Zend_Db). To create a new entity class with Doctrine 2, all you have to do is write a regular PHP class with properties and provide some hints on how these properties should be persisted. The information on how entity attributes relate to columns in the database is called Metadata. Metadata can be described in different ways: there are metadata drivers for descriptions in XML, YAML and PHP. The fourth and most popular driver is based on DocBlock annotations. Since annotations are not a language feature in PHP as they are in Java, they are placed in the DocBlocks of classes and attributes. We will use annotations to describe our entities' metadata. To get an impression of how easy this is, take a look at the following example.
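A minimal sketch of such an entity. The table name users and the properties id and name are assumptions chosen for illustration. The annotations use the unprefixed style of the Doctrine 2.0/2.1 documentation; newer setups usually import Doctrine\ORM\Mapping as ORM and write @ORM\Entity instead.

```php
<?php
/**
 * A plain PHP class; the DocBlock annotations are the only ORM-specific part.
 *
 * @Entity
 * @Table(name="users")
 */
class User
{
    /**
     * Primary key, generated by the database.
     *
     * @Id
     * @Column(type="integer")
     * @GeneratedValue
     */
    private $id;

    /** @Column(type="string", length=100) */
    private $name;

    public function getId()
    {
        return $this->id;
    }

    public function getName()
    {
        return $this->name;
    }

    public function setName($name)
    {
        $this->name = $name;
    }
}
```

Because the annotations live in comments, the class can be instantiated and used without Doctrine being loaded at all.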
This example contains all it needs to tell Doctrine 2 about the new entity User. With this class, you can create,
find, delete and modify user objects and persist their state to the underlying database. But keep in mind: as long
as you don't need any persistence features, you can use your user objects just like any other objects!
The next two object types we will describe are responsible for the actual ORM functionality: persisting and finding.
The Entity Manager (Doctrine\ORM\EntityManager) is the main access point to Doctrine 2's ORM functionality. As you might
have guessed, it is responsible for managing entities, and it acts as a facade for the whole framework.
To accomplish its tasks, the entity manager uses some helpers. The Unit of Work object, for example, collects entities
that should be written back to the database and is capable of doing this in batches. This way, database operations
can be executed with very little overhead and are therefore fast.
Another dependency of the entity manager is the Event Manager. To be as extensible as possible, Doctrine 2 comes with an event system that publishes all important state changes to the outside as events. You can register for such events and extend the life cycle of your entity objects at one single point.
The entity manager's API combines methods for managing entities (find, persist, contains, copy, detach, merge,
remove and refresh), methods that control the use of transactions (beginTransaction, commit, flush, rollback and
transactional) and some helper methods for creating custom queries and accessing some of the entity manager's dependencies.
The following example shows how to query an object from the entity manager, modify it and write the changes back into the database.
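A minimal sketch of this round trip. The function name renameUser and the setName method on the entity are assumptions; find and flush are the entity manager methods listed above.

```php
<?php
/**
 * Loads a user, changes its name and writes the change back.
 *
 * $em is expected to be a Doctrine\ORM\EntityManager; the type hint is left
 * out so the sketch stays free of dependencies.
 */
function renameUser($em, $id, $newName)
{
    // Fetch the entity by its primary key; null if no such row exists.
    $user = $em->find('User', $id);
    if ($user === null) {
        return null;
    }

    // This only changes the object in memory.
    $user->setName($newName);

    // The unit of work detects the change and issues the UPDATE here.
    $em->flush();

    return $user;
}
```

Note that no explicit "save" call on the user object is needed: entities returned by find are already managed, so flush picks up their changes.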
Creating a new persistent object is almost as easy as modifying it:
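A minimal sketch, assuming a User entity with a setName method (its mapping annotations are omitted here for brevity) and a hypothetical helper named createUser:

```php
<?php
// Stand-in for the mapped User entity; the annotations are left out on purpose.
class User
{
    private $id;
    private $name;

    public function setName($name)
    {
        $this->name = $name;
    }

    public function getName()
    {
        return $this->name;
    }
}

/**
 * Creates a user and makes it persistent.
 * $em is expected to be a Doctrine\ORM\EntityManager.
 */
function createUser($em, $name)
{
    $user = new User();
    $user->setName($name);

    // persist() only schedules the new object for insertion ...
    $em->persist($user);
    // ... the INSERT itself is executed on flush().
    $em->flush();

    return $user;
}
```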
For finding entities, Repositories are used. Every entity class has its own repository which is responsible for finding entities of that type. By default, repositories have some handy methods for fetching entities that match certain criteria:
find: Finds an entity by its primary key / identifier.
findAll: Finds all entities of the repository's entity type.
findBy / findOneBy: Finds all entities, or a single entity, matching the passed criteria.
findBy<attribute> / findOneBy<attribute>: Magic methods that ease filtering by a single attribute.
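The calls could look as follows; a sketch that assumes a User entity with a name attribute. The function name and the concrete values are made up for illustration.

```php
<?php
/**
 * Runs each of the default finder methods once.
 * $em is expected to be a Doctrine\ORM\EntityManager.
 */
function demoFinders($em)
{
    $repository = $em->getRepository('User');

    return array(
        // By primary key.
        'byId'      => $repository->find(42),
        // Every User in the table.
        'all'       => $repository->findAll(),
        // Criteria are passed as attribute => value pairs.
        'many'      => $repository->findBy(array('name' => 'alice')),
        'one'       => $repository->findOneBy(array('name' => 'alice')),
        // Magic variants: the attribute name becomes part of the method name.
        'manyMagic' => $repository->findByName('alice'),
        'oneMagic'  => $repository->findOneByName('alice'),
    );
}
```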
To access a repository, all you have to do is ask the entity manager for one. If you have implemented your own
repository, it will be returned by Doctrine\ORM\EntityManager::getRepository(). Otherwise, Doctrine 2 will provide
a generic repository. The main reason to implement custom repository classes is to group custom queries for an entity
type to make them reusable. For custom query logic, there are several mechanisms you can use: You can either use
Doctrine's query builder that implements an API similar to Zend_Db_Select or queries written in the Doctrine Query
Language (DQL) or you can even execute plain SQL queries. With these options, it is also possible to migrate old
applications which use complex queries by just wrapping these queries into the methods of custom repositories.
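As a rough illustration of the DQL option, the following sketch wraps one query in a reusable function. The function name, the User entity and its name attribute are assumptions; in a custom repository the same call would typically live in a method of the repository class.

```php
<?php
/**
 * Finds users whose name starts with the given prefix.
 * $em is expected to be a Doctrine\ORM\EntityManager.
 */
function findUsersByNamePrefix($em, $prefix)
{
    // DQL refers to entity classes and their attributes, not to tables and columns.
    $query = $em->createQuery(
        'SELECT u FROM User u WHERE u.name LIKE :pattern ORDER BY u.name'
    );
    // Bound parameters keep user input out of the query string.
    $query->setParameter('pattern', $prefix . '%');

    // Returns an array of User objects.
    return $query->getResult();
}
```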
When traversing a graph of entity objects (which is required when entities have relations to other entities),
it would be very expensive (in the sense of “requiring many database queries”) to fetch every dependent entity with
an additional query. Therefore Doctrine 2 uses the concept of Proxy objects, which stand in for regular entity objects
that have not yet been populated with data from the database. Take a look at the following example, where the entity
Group aggregates a list of User objects in its members property. When the members list is accessed, Doctrine 2 provides a
collection of proxy objects instead of complete User objects. When an object of this collection is asked for
one of its properties, Doctrine loads the object's data from the database. This way, the users' data is not loaded
until it is really needed.
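The idea can be sketched in plain PHP. This is not the code Doctrine generates; UserProxy and its loader callback are hypothetical stand-ins that only show the lazy-loading principle.

```php
<?php
class User
{
    protected $id;
    protected $name;

    public function __construct($id, $name = null)
    {
        $this->id = $id;
        $this->name = $name;
    }

    public function getId()
    {
        return $this->id;
    }

    public function getName()
    {
        return $this->name;
    }
}

/**
 * Hypothetical proxy: it knows the identifier from the start and fetches
 * everything else on first access.
 */
class UserProxy extends User
{
    private $loader;
    private $loaded = false;

    public function __construct($id, $loader)
    {
        parent::__construct($id);
        $this->loader = $loader;
    }

    public function getName()
    {
        if (!$this->loaded) {
            // The first property access triggers the (simulated) query.
            $row = call_user_func($this->loader, $this->id);
            $this->name = $row['name'];
            $this->loaded = true;
        }

        return parent::getName();
    }
}
```

Calling code cannot tell the difference, since the proxy extends User; it simply pays for the database query at the moment it first asks for the name.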
This section describes some advanced concepts that are required when mapping entity classes that have relationships to
other entity classes. Possible relationship types are association and inheritance. Inheritance is the mechanism used for
representing subtypes in object-oriented programming languages. An example would be a class User that implements
methods every user of a software should have and a class Administrator inheriting from User that adds methods for
determining the administrator's access rights.
Association is a weaker relation type. It means that an entity object can be related to entity objects of other types. In terms of relational databases, there are three types of association, which differ in the number of entities an object is related to: 1:1, 1:n and n:m relationships. Here n and m are placeholders meaning "multiple".
To put objects of an entity type into relation, you just have to mention this relation in the entity class's mapping
information. The simplest case is a unidirectional 1:1 relationship. In the following example we describe a User entity
which has its access information (user credentials) encapsulated in another entity class called UserCredential. Since
every user has at most one credential object and every credential object may only be associated with one user object,
this is a 1:1 relationship.
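A sketch of such a mapping. The column name credential_id and the login attribute are assumptions; the annotations follow the unprefixed Doctrine 2.0/2.1 style.

```php
<?php
/** @Entity */
class UserCredential
{
    /** @Id @Column(type="integer") @GeneratedValue */
    private $id;

    /** @Column(type="string") */
    private $login;

    public function setLogin($login)
    {
        $this->login = $login;
    }

    public function getLogin()
    {
        return $this->login;
    }
}

/** @Entity */
class User
{
    /** @Id @Column(type="integer") @GeneratedValue */
    private $id;

    /**
     * Unidirectional: only User knows about the relation.
     *
     * @OneToOne(targetEntity="UserCredential")
     * @JoinColumn(name="credential_id", referencedColumnName="id")
     */
    private $credential;

    public function setCredential(UserCredential $credential)
    {
        $this->credential = $credential;
    }

    public function getCredential()
    {
        return $this->credential;
    }
}
```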
If the relationship should be bidirectional, include the OneToOne annotation in the other class, too, and add an
attribute to it that names the property of the other entity which maps the related object:
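A sketch under the same assumptions as before (column name credential_id, unprefixed annotations). Here User is assumed to hold the foreign key, and the inverse side refers to it via mappedBy.

```php
<?php
/** @Entity */
class User
{
    /** @Id @Column(type="integer") @GeneratedValue */
    private $id;

    /**
     * This side holds the foreign key and names the property on the other side.
     *
     * @OneToOne(targetEntity="UserCredential", inversedBy="user")
     * @JoinColumn(name="credential_id", referencedColumnName="id")
     */
    private $credential;

    public function setCredential(UserCredential $credential)
    {
        $this->credential = $credential;
        // Keep the object graph consistent in memory as well.
        $credential->setUser($this);
    }

    public function getCredential()
    {
        return $this->credential;
    }
}

/** @Entity */
class UserCredential
{
    /** @Id @Column(type="integer") @GeneratedValue */
    private $id;

    /**
     * mappedBy points at the property User::$credential.
     *
     * @OneToOne(targetEntity="User", mappedBy="credential")
     */
    private $user;

    public function setUser(User $user)
    {
        $this->user = $user;
    }

    public function getUser()
    {
        return $this->user;
    }
}
```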
This way, you can access the user object from the credentials object, too.
Most of the time, developers have to deal with relationships that include many objects on at least one side.
These relationships are called 1:n or n:m relationships. This means that either one or multiple entities stand
in relationship with an arbitrary number of entities of another type. To accomplish this, you have to use the mapping
keywords OneToMany or ManyToMany when describing your entities. Apart from that, the mapping works exactly the same way as
with 1:1 relationships.
There are however some tricks you should know when dealing with collections of associated entity objects. Consider
the following relationship between the entity classes User and Group:
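A sketch of how this could be mapped, assuming a bidirectional many-to-many relationship. The table names groups, users and group_members are illustrative choices.

```php
<?php
/**
 * @Entity
 * @Table(name="groups")
 */
class Group
{
    /** @Id @Column(type="integer") @GeneratedValue */
    private $id;

    /**
     * The users that belong to this group.
     *
     * @ManyToMany(targetEntity="User", inversedBy="groups")
     * @JoinTable(name="group_members")
     */
    private $members;

    public function getMembers()
    {
        return $this->members;
    }
}

/**
 * @Entity
 * @Table(name="users")
 */
class User
{
    /** @Id @Column(type="integer") @GeneratedValue */
    private $id;

    /**
     * The groups this user is a member of.
     *
     * @ManyToMany(targetEntity="Group", mappedBy="members")
     */
    private $groups;

    public function getGroups()
    {
        return $this->groups;
    }
}
```

Neither class initializes its collection property yet, so a freshly constructed Group returns null from getMembers().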
When a group has at least one member, the group object will have a collection of the type
Doctrine\Common\Collections\ArrayCollection set as its members property. This collection contains
all user objects (or proxy objects, as we have seen before) and can be modified intuitively with the methods add and
removeElement. To honor object orientation, you might want to introduce custom methods for these tasks. If you do so,
you get into trouble when the group object does not have any users associated with it. In this case, the collection will simply be
set to null. To avoid having to check whether the collection has already been initialized, you should do this yourself
in the entity class's constructor:
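A sketch of such a constructor together with two custom methods. The method names addMember and removeMember are assumptions; running the code requires the Doctrine Common library to be loadable.

```php
<?php
use Doctrine\Common\Collections\ArrayCollection;

/**
 * @Entity
 * @Table(name="groups")
 */
class Group
{
    /** @Id @Column(type="integer") @GeneratedValue */
    private $id;

    /** @ManyToMany(targetEntity="User", inversedBy="groups") */
    private $members;

    public function __construct()
    {
        // Start with an empty collection so the methods below never see null.
        $this->members = new ArrayCollection();
    }

    public function addMember($user)
    {
        $this->members->add($user);
    }

    public function removeMember($user)
    {
        $this->members->removeElement($user);
    }

    public function getMembers()
    {
        return $this->members;
    }
}
```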
It is also important to notice that one entity has to update the other entity's state as well when a relationship between two objects is created or removed. Take care to do this in only one class to avoid endless recursion loops! This class is called the Owning Side of the relationship. When implementing a bidirectional relationship, the other class is called the Inverse Side. It is important to determine the owning and inverse side and implement the classes accordingly to avoid greater trouble during debugging.
There are some more features implemented by Doctrine 2 that enable developers to specify their entities' relationships, including sorting, pre-fetching and indexing. These topics are not covered in this article but are explained clearly in the Doctrine 2 documentation.
Subtyping can be implemented in different ways using Doctrine 2. The main difference between these implementations is how the inheritance is mapped to the database. The options are to have one table for every class (Class Table Inheritance), to have one table for all classes in a hierarchy (Single Table Inheritance) and to have a table for every specialized subclass of a given superclass (Mapped Superclass). We will give a short overview of all three alternatives; you have to pick the right one yourself. This decision should be based on how many common attributes your subclasses share.
Introducing a mapped superclass is probably the easiest way for specifying inheritance but might lead to many duplicate columns in your database schema. The superclass of your entities is not being declared as an entity itself (and might also be declared abstract) but provides attributes and optionally methods that will be available in all subclasses. When creating the database schema, Doctrine 2 merges all attributes and relationships of the superclass into the definitions of the subclasses and processes them as regular entities.
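A sketch of this variant. The superclass Person and the attributes name, email and accessLevel are hypothetical and chosen only to show which columns end up where.

```php
<?php
/**
 * Not an entity itself; it only contributes mapping information.
 *
 * @MappedSuperclass
 */
abstract class Person
{
    /** @Id @Column(type="integer") @GeneratedValue */
    protected $id;

    /** @Column(type="string") */
    protected $name;

    public function setName($name)
    {
        $this->name = $name;
    }

    public function getName()
    {
        return $this->name;
    }
}

/** @Entity */
class User extends Person
{
    /** @Column(type="string") */
    protected $email;
}

/** @Entity */
class Administrator extends Person
{
    /** @Column(type="integer") */
    protected $accessLevel;
}
```

With this mapping, each of the two entity tables gets its own id and name columns in addition to its specific column, which is where the duplicate columns come from.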
After creating the database from this mapping information, your tables will look like this:
When you have entities that are very similar apart from a few attributes, you might want to store them together in one database table. This approach is called Single Table Inheritance. To distinguish between the different types, there is always a column marked as discriminator column and a discriminator map that tells Doctrine 2 which value in the discriminator column indicates which entity type.
These definitions result in a single table called User with all the attributes declared inside the classes User and Administrator plus a column named type (the discriminator column). When working with entities of these types, Doctrine will manage the type flag automatically for you.
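A sketch of the mapping for the entities User and Administrator. The discriminator values user and admin and the attributes name and accessLevel are assumptions.

```php
<?php
/**
 * Root of the hierarchy: declares the inheritance type, the discriminator
 * column and the map from discriminator values to classes.
 *
 * @Entity
 * @InheritanceType("SINGLE_TABLE")
 * @DiscriminatorColumn(name="type", type="string")
 * @DiscriminatorMap({"user" = "User", "admin" = "Administrator"})
 */
class User
{
    /** @Id @Column(type="integer") @GeneratedValue */
    protected $id;

    /** @Column(type="string") */
    protected $name;
}

/** @Entity */
class Administrator extends User
{
    /**
     * Lives in the shared table, so it must be nullable for plain users.
     *
     * @Column(type="integer", nullable=true)
     */
    protected $accessLevel;
}
```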
The resulting database schema looks as illustrated by the following diagram:
Having each entity type stored in its own table is always good for keeping your schema extensible. When you have to create a new subtype, Doctrine 2 will just create a new table for this type and it can inherit the logic and common attributes of a superclass. The only overhead you have with this approach is that all tables that correspond to subtypes have to maintain a relationship to their supertype's table. Using class table inheritance, the example with the entities User and Administrator looks like this:
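A sketch of this mapping, with the same assumed discriminator values and attributes as in a single table setup; only the inheritance type is JOINED.

```php
<?php
/**
 * @Entity
 * @InheritanceType("JOINED")
 * @DiscriminatorColumn(name="type", type="string")
 * @DiscriminatorMap({"user" = "User", "admin" = "Administrator"})
 */
class User
{
    /** @Id @Column(type="integer") @GeneratedValue */
    protected $id;

    /** @Column(type="string") */
    protected $name;
}

/** @Entity */
class Administrator extends User
{
    /**
     * Stored in the Administrator table, next to a key referencing User.
     *
     * @Column(type="integer")
     */
    protected $accessLevel;
}
```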
Besides the inheritance type, there is no difference from the example using single table inheritance. The effect on the resulting database schema, however, is huge: you now have two separate tables that store users and administrators. Every record in the table Administrator has a corresponding record in the User table.
This was the first part of this article. Stay tuned for part II which will be published tomorrow (on 6th of December 2011)! In the second part, we will integrate Doctrine 2 into a Zend Framework application and include a generic sandbox (ZF-)project with Doctrine 2!
During the development of an application, not all time is spent on writing code. A lot of time is spent on reading debug output, crawling through log files and firing up the debugger to figure out what the application does. While the debugger helps us to inspect details of a running application on a testing environment, logfiles are often the only indication of the origin of an error on a production system. In this blogpost I want to describe how to log SQL statements on an existing application without touching any existing line of code at all. We will use a new MySQLnd Extension developed at the Mayflower OpenSource Labs for that purpose.
As an example, I will use PHProjekt 6. The project is particularly suitable for demonstration purposes as it has a logging infrastructure for function calls, but does not log SQL statements.
Next Thursday, 11 March 2010, there will again be a public talk at the Mayflower office in Munich (Mannhardtstraße 6, S-Bahn Isartor).
When I talked with journalists, lawyers and analysts about the Oracle/Sun merger case, questions were raised about the possibility of forking MySQL, the idea being that anybody who is not satisfied with Oracle's future direction for MySQL could simply do so. I don't agree with that, and I think it's best to quote Monty's own words (found in a comment on his blog) here because I can't explain it better:
In addition, the MySQL trademark is so strong that it's hard to impossible for a fork to attract enough attention to be able to compete in a meaningful manner if MySQL would be owned by a vendor that refuses to cooperate and works against the fork.
These are tough days in the case of the Oracle/MySQL decision the EU faces. First of all, Oracle's lobbyists achieved an extension of the decision deadline from January 19th to January 27th, 2010. Secondly, Monty suggested that a license change from GPL to BSD would be a great idea for MySQL's future.
Today, Johann pointed me to a document called "Project Peter" which can be found at wikileaks.org (download PDF from wikileaks.org server in Sweden). It's a presentation of MySQL's Robin Schumacher. You may ask "What is Project Peter?". The presentation says:
Project Peter is an internal effort to assist Sun/MySQL customers in migrating from Oracle to MySQL by offering them a comprehensive solution that consists of Professional Services, Best Practices, and a set of approved third party migration tools and utilities that will enable them to move to MySQL in a way that is as easy as possible.
Marten Mickos, former CEO of MySQL, tweeted some time ago about an interview in eWeek in which he was asked whether Oracle and MySQL compete directly with each other. On page 2 of this interview, he states that Oracle and MySQL certainly do compete with each other:
"MySQL most certainly competes with Oracle," Mickos said. "And successfully so. But what must be remembered in terms of dollars in that competition, it is not significant enough to warrant an antitrust consideration. Secondly, this competition happens partly outside of the business—in the free, installed base.
"So no matter who owns MySQL, the competition will continue to exist."
Even if Oracle does ultimately own the MySQL code base and act as the enterprise headquarters for the database, "MySQL will still apply price pressure on Oracle," Mickos said. "That won't change. This is why there's no reason to stop the acquisition."
Asked about the future of MySQL, Mickos claimed: "The MySQL business is a very strong business, with enormous potential in the next 10 to 20 years."
So, maybe MySQL doesn't compete in terms of dollars today. But if MySQL does have a bright future in the next 10 or 20 years, there is reason to expect that the numbers will climb in the era of the "database for the web". That's why there is a Project Peter for Sun's sales force to try to convert Oracle customers to MySQL. I'm not sure whether Oracle will accept a Project Peter once it owns Sun and MySQL. I guess they'll shut down Project Peter because MySQL may be something of a threat to Oracle's business in certain areas.
And this is why Oracle mustn't own MySQL.
According to Yahoo News (and WSJ, only for subscribers - thanks to @Oswald for mentioning it on twitter, see also Reuters), the EU is opening an in-depth probe into the Oracle-Sun deal.
According to the source, EU Competition Commissioner Neelie Kroes said:
"The (European) Commission has an obligation to ensure that customers would not face reduced choice or higher prices as a result of this takeover,"
Furthermore, the Commission set a January 19, 2010 deadline for its decision, Yahoo News said.
PS: The U.S. has already approved the deal.
The German Oracle User Association (DOAG e.V.) has published a statement (in German) about the acquisition of Sun/MySQL by Oracle and its impact for Oracle users. You can find the statement here.
Oh and btw, I'll give a session about "PHP5 & Oracle" at the local Oracle usergroups in Frankfurt on June, 23rd and Hamburg on Sep 14th. The main goal is to promote the usage of PHP5 in Oracle environments (and how you can leverage PHP's potential in Enterprise environments) as there are good Oracle database connectors for PHP5 available. See you there!
IPC is over. My impression: the venue was too big, which made it a little difficult to get in contact with others. Yet, from a technical and gastronomical point of view, the Rheingoldhalle was a good choice. For the next IPC I would recommend anchoring a hotel ship near the hall (the Rheingoldhalle is situated on the bank of the river Rhine) to avoid a 30-minute shuttle bus ride to and from the hotel. ;-)
But back to my talk there.
The initial idea for this session came from a performance consulting job in spring this year: for a table with approx. 250 billion entries, I found a way to store and read about 6,000 queries per second! I applied some very unusual techniques to speed up the solution by a factor of 1,000 or 2,000, just by thinking about how I would do it if I had to store these things in my home, supposing they were real things such as cutlery.
I found out that there are some patterns that can be used in general and that work for nearly every problem with very big tables. Just see for yourself how I solved the problem.
Please note: the slides probably cannot be understood without some of the ideas being explained. Feel free to post your questions as comments!
PS: The "C" in the pictures is the "catalog".
PPS: YES, we've uploaded a new slideshare, the pictures are now working.
Certifications are "in". Nowadays you can get certifications for almost every aspect of life. Admittedly, some of those certs you can get just by surviving a boring day in a classroom or, with more luck, by joining a two-week 20k yacht trip off Hawaii that is regularly interrupted only by conference speeches, workshops or lessons.
One of the workshops at our Barcamp two weeks ago dealt with the MySQL Proxy by Jan Kneschke.
Yet we found out that the proxy is rather unusable for our task. Read here why.
With max_connections you define how many client connections a MySQL server accepts at most. Once this number is reached, no further connections to the database can be made and MySQL throws the error "Too many connections" back at us.
In large applications such as communities or shop systems, it is quite common that the web application, i.e. the PHP code, is not the only thing accessing the database. There may also be backend systems, batch processes, shell scripts and the like that want to connect to the database as well. In sum, quite a lot of different systems access the database. You also need to know that in a typical Apache + PHP configuration, each httpd process usually holds at least one open database connection (unless you try connection pooling via SQLRelay or MySQL Proxy). The nice thing about PHP's "shared nothing" architecture is that the connection to the database is closed again at the end of the request, which frees a connection slot.
So to prevent the error message mentioned above from appearing at peak load, you have to calculate across all parts of the architecture how many connections to the MySQL database could be open simultaneously at most. That number is the minimum you have to set as max_connections. Problems can arise with long-lived processes (batch applications, runaway processes that hang around in the system, recursion bugs that open n unplanned connections and so on) that occupy a corresponding number of database connections, intentionally or not.
Problems can also arise from peaks in page requests. Since each httpd process opens at least one connection to the database, the Apache configuration directives that limit the maximum number of httpd processes should be used to configure, and thereby control, the total number of database connection slots.
My colleague Köhntopp adds that, for debugging, writing the output of "SHOW FULL PROCESSLIST" plus date to a log file once a minute is one way to monitor whether overflows occur. I believe the non-DIY option, MySQL Enterprise Monitor (code name Merlin) from the MySQL Enterprise distribution, also provides suitable mechanisms for monitoring and logging. In addition, keep in mind that max_connections always goes hand in hand with open_files_limit, because every open socket also occupies a file handle. A table occupies 2 to 3 file handles, which yields the concise formula open_files_limit = max_connections + table_cache * (2 to 3).
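A small worked example of the open_files_limit formula, with made-up values of 500 connections and a table cache of 256; the function name is hypothetical.

```php
<?php
/**
 * Estimates the range for open_files_limit: one file handle per connection
 * plus 2 to 3 file handles per cached table.
 */
function openFilesLimitRange($maxConnections, $tableCache)
{
    return array(
        'min' => $maxConnections + $tableCache * 2,
        'max' => $maxConnections + $tableCache * 3,
    );
}

$range = openFilesLimitRange(500, 256);
printf("open_files_limit between %d and %d\n", $range['min'], $range['max']);
// prints: open_files_limit between 1012 and 1268
```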
My highly esteemed colleague Jo Brunner has once again reached deep into his bag of tricks and, as part of his Xenjo Cache Registry, spent his precious vacation (chapeau!) digging into lighttpd, mod_magnet and Lua. You can find more details on his blog...
In general, it is worth taking a look at everything Jan Kneschke (the author of lighttpd) builds. If you plan to do some nasty things with MySQL (for example sharding, or distributing DB queries across master and slaves), you should also have a look at MySQL Proxy. MySQL Proxy is a query interceptor that sits between the client (for example PHP) and the database server. As usual, MySQL Proxy is controlled with Lua, the small and nimble scripting language. So you can rewrite queries on the fly or redirect them to other DB servers. One advantage: you don't have to adapt your application layer to direct INSERTs, UPDATEs and DELETEs to the master and SELECTs to the slaves; MySQL Proxy takes care of that with a small, suitable Lua script.
Hi Folks,
This is an announcement for a webinar held in German. If you are interested in the security topic, be sure to see the English webinar, which is stored here.
The improved usability of Web 2.0 applications comes at the cost of new security problems. Both the powerful logic in JavaScript and the permanent login on many sites carry risks that need different, targeted answers. This web seminar gives an overview, assesses the problems and presents solutions.
If you develop Web 2.0 and AJAX applications, this talk is exactly right for you! Here you will learn:



















